WO2018171531A1 - System and method for predicting a classification for an object - Google Patents

System and method for predicting a classification for an object

Info

Publication number
WO2018171531A1
Authority
WO
WIPO (PCT)
Prior art keywords
objects
classification
label
user
probability vector
Prior art date
Application number
PCT/CN2018/079348
Other languages
English (en)
Inventor
Zhiwei QIN
Chengxiang ZHUO
Wei Tan
Original Assignee
Beijing Didi Infinity Technology And Development Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Didi Infinity Technology And Development Co., Ltd.
Priority to CN201880020197.1A (CN110447039A)
Publication of WO2018171531A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques

Definitions

  • the present disclosure generally relates to a data processing system, and more particularly, relates to a system and method for predicting a classification for an object.
  • Predictive analytics may analyze a plurality of data sets associated with a plurality of current and/or historical facts to make predictions about future or otherwise unknown objects.
  • Predictive analytics may be used in a plurality of applications, for example, analytical customer relationship management, clinical decision support, collection analytics, cross sell, customer retentions, direct marketing, fraud detection, etc.
  • Predicting a classification for an object may be a common issue in predictive analytics.
  • Internet-based on-demand services, such as online taxi-calling services, have become increasingly popular because of the convenience they offer.
  • a predicted classification of an object may help an on-demand service system provide better services for the object.
  • An on-demand service system may predict a classification for an object by itself and/or make use of the predicted classification.
  • a variety of statistical techniques (e.g., predictive modeling, machine learning, data mining, etc.) may be used in object classification.
  • However, label noise may induce a potential negative consequence in object classification. Accordingly, the accuracy of predicting and/or classifying unlabeled objects may be relatively low. It is desirable to provide a system and method to improve the accuracy and efficiency of predicting a classification for an object.
  • a method for predicting a classification for an object may be implemented on at least one device, each of which has at least one processor and a storage.
  • the method may include one or more of the following operations.
  • the at least one processor may obtain data relating to a first set of objects.
  • the first set of objects may include a plurality of first objects and a plurality of second objects.
  • Each of the plurality of first objects may include a class label and each of the plurality of second objects may be unlabeled.
  • the at least one processor may determine a predicted label for each of the first set of objects based on the data relating to the first set of objects and a label propagation algorithm.
  • the at least one processor may determine an initial label transformation matrix with respect to the first set of objects based on the class labels and the predicted labels associated therewith.
  • the at least one processor may obtain one or more subsets of second objects by sampling the plurality of second objects.
  • the at least one processor may generate one or more combined subsets of objects based on the one or more subsets of second objects and the plurality of first objects.
  • the at least one processor may determine a classification prediction model and an updated label transformation matrix associated with each of the one or more combined subsets of objects based on a label noise-tolerant classification algorithm and the initial label transformation matrix.
  • the at least one processor may predict a classification for at least one of the plurality of second objects based on the classification prediction model and the updated label transformation matrix associated with each of the one or more combined subsets of objects.
  • a system may include at least one computer-readable storage medium including a set of instructions for managing supply of services, and at least one processor in communication with the at least one storage medium.
  • when executing the set of instructions, the at least one processor may be directed to perform one or more of the following operations.
  • the at least one processor may obtain data relating to a first set of objects.
  • the first set of objects may include a plurality of first objects and a plurality of second objects.
  • Each of the plurality of first objects may include a class label and each of the plurality of second objects may be unlabeled.
  • the at least one processor may determine a predicted label for each of the first set of objects based on the data relating to the first set of objects and a label propagation algorithm.
  • the at least one processor may determine an initial label transformation matrix with respect to the first set of objects based on the class labels and the predicted labels associated therewith.
  • the at least one processor may obtain one or more subsets of second objects by sampling the plurality of second objects.
  • the at least one processor may generate one or more combined subsets of objects based on the one or more subsets of second objects and the plurality of first objects.
  • the at least one processor may determine a classification prediction model and an updated label transformation matrix associated with each of the one or more combined subsets of objects based on a label noise-tolerant classification algorithm and the initial label transformation matrix.
  • the at least one processor may predict a classification for at least one of the plurality of second objects based on the classification prediction model and the updated label transformation matrix associated with each of the one or more combined subsets of objects.
  • a non-transitory computer readable medium may include at least one set of instructions for providing an on-demand service.
  • the at least one set of instructions may be executed by at least one processor.
  • the at least one processor may be directed to perform one or more of the following operations.
  • the at least one processor may obtain data relating to a first set of objects.
  • the first set of objects may include a plurality of first objects and a plurality of second objects.
  • Each of the plurality of first objects may include a class label and each of the plurality of second objects may be unlabeled.
  • the at least one processor may determine a predicted label for each of the first set of objects based on the data relating to the first set of objects and a label propagation algorithm.
  • the at least one processor may determine an initial label transformation matrix with respect to the first set of objects based on the class labels and the predicted labels associated therewith.
  • the at least one processor may obtain one or more subsets of second objects by sampling the plurality of second objects.
  • the at least one processor may generate one or more combined subsets of objects based on the one or more subsets of second objects and the plurality of first objects.
  • the at least one processor may determine a classification prediction model and an updated label transformation matrix associated with each of the one or more combined subsets of objects based on a label noise-tolerant classification algorithm and the initial label transformation matrix.
  • the at least one processor may predict a classification for at least one of the plurality of second objects based on the classification prediction model and the updated label transformation matrix associated with each of the one or more combined subsets of objects.
  • the data relating to the first set of objects may include characteristic data associated with each of the first set of objects and relationship data among the first set of objects.
  • the at least one processor may transform the characteristic data associated with each of the first set of objects into a characteristic vector to obtain a plurality of characteristic vectors, determine one or more cosine similarity values between two related characteristic vectors based on the relationship data and the plurality of characteristic vectors, and determine the predicted label for each of the first set of objects by propagating a plurality of class labels of the plurality of first objects to the first set of objects based on the one or more cosine similarity values.
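  • By way of illustration only, the following Python sketch shows one way such a label-propagation step could be implemented, assuming numeric characteristic vectors and a symmetric 0/1 relation matrix; the function name propagate_labels and all parameters are illustrative and are not taken from the disclosure.

```python
import numpy as np

def propagate_labels(features, relations, labels, n_classes, n_iter=50):
    """Illustrative label-propagation sketch (not the patented implementation).

    features  : (n, d) array, one characteristic vector per object
    relations : (n, n) 0/1 array, 1 where two objects are directly related
    labels    : length-n integer array, class index for labeled (first) objects, -1 if unlabeled
    """
    features = np.asarray(features, dtype=float)
    relations = np.asarray(relations, dtype=float)
    labels = np.asarray(labels)
    n = features.shape[0]

    # Cosine similarity, kept only for pairs that are related according to the relationship data.
    unit = features / (np.linalg.norm(features, axis=1, keepdims=True) + 1e-12)
    weights = (unit @ unit.T) * relations

    # Row-normalize the weights to obtain a propagation (transition) matrix.
    transition = weights / (weights.sum(axis=1, keepdims=True) + 1e-12)

    # Start from one-hot distributions for labeled objects and uniform ones for unlabeled objects.
    dist = np.full((n, n_classes), 1.0 / n_classes)
    labeled = labels >= 0
    one_hot = np.eye(n_classes)[labels[labeled].astype(int)]
    dist[labeled] = one_hot

    for _ in range(n_iter):
        dist = transition @ dist
        dist[labeled] = one_hot            # clamp the known class labels after each pass

    return dist.argmax(axis=1)             # predicted label for every object in the first set
```

  • In this sketch the class labels of the first objects are clamped after every pass, so propagation never overwrites a known class label; only the unlabeled second objects acquire new predicted labels.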
  • the one or more subsets of second objects may include distinct samples of the plurality of second objects.
  • each of the one or more subsets of second objects may include a percentage of samples from the plurality of second objects.
  • the at least one processor may generate each of the one or more combined subsets of objects by combining each of the one or more subsets of second objects with the plurality of first objects.
  • the at least one processor may determine at least one first classification probability vector with respect to the at least one of the plurality of second objects based on the classification prediction model and characteristic data associated with the at least one of the plurality of second objects, determine at least one second classification probability vector with respect to the at least one of the plurality of second objects based on the updated label transformation matrix and the predicted label of the at least one of the plurality of second objects, and determine the classification for the at least one of the plurality of second objects based on the at least one first classification probability vector and the at least one second classification probability vector.
  • the at least one processor may determine a target classification probability vector based on the at least one first classification probability vector and the at least one second classification probability vector, and designate a label relating to a maximal value of the target classification probability vector as the classification for the at least one of the plurality of second objects.
  • the at least one processor may designate one or more mean values with respect to the at least one first classification probability vector and the at least one second classification probability vector as one or more elements of the target classification probability vector, or designate a weighted sum of the at least one first classification probability vector and the at least one second classification probability vector as the target classification probability vector.
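  • A minimal sketch of that combination step, assuming each combined subset contributes one first and one second classification probability vector; target_probability_vector and predict_class are illustrative names, not terms from the disclosure.

```python
import numpy as np

def target_probability_vector(first_cpvs, second_cpvs, weights=None):
    """Combine first and second classification probability vectors into a target vector.

    first_cpvs, second_cpvs : lists of length-K probability vectors, one pair per combined subset
    weights                 : optional per-vector weights for a weighted sum; defaults to a mean
    """
    vectors = np.asarray(list(first_cpvs) + list(second_cpvs), dtype=float)
    if weights is None:
        return vectors.mean(axis=0)                       # element-wise mean of all vectors
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * vectors).sum(axis=0) / w.sum()   # weighted sum, normalized

def predict_class(first_cpvs, second_cpvs, weights=None):
    """Designate the label with the maximal value of the target vector as the classification."""
    return int(np.argmax(target_probability_vector(first_cpvs, second_cpvs, weights)))
```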
  • each of the first set of objects may be a user.
  • the classification of each of the first set of objects may include at least one of an age group of the user, a travel preference of the user, a travel time of the user, a consumption level of the user, or a propensity to consume of the user.
  • each of the first set of objects may be a user.
  • the data relating to the first set of objects may include characteristic data associated with each of the first set of objects, and the characteristic data associated with each of the first set of objects may include at least one of a first piece of information relating to one or more historical travel locations of the user, or a second piece of information relating to one or more applications installed on a terminal device associated with the user.
  • each of the first set of objects may be a user.
  • the data relating to the first set of objects may include relationship data, and the relationship data may include at least one of a third piece of information relating to the sending of one or more red packets between two or more of the first set of objects, or a fourth piece of information relating to one or more friendships between two or more of the first set of objects.
  • the label noise-tolerant classification algorithm may be a label-noise robust multiclass logistic regression algorithm.
  • the initial label transformation matrix may indicate a probability that a class label associated with the plurality of first objects is transformed to a predicted label associated with the plurality of first objects.
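  • One plausible way to build such an initial matrix is to count, over the labeled first objects, how often each class label is transformed to each predicted label, and then row-normalize the counts. The sketch below assumes integer label indices and is illustrative only.

```python
import numpy as np

def initial_label_transformation_matrix(class_labels, predicted_labels, n_classes):
    """Estimate P(predicted label = j | class label = k) over the labeled (first) objects.

    class_labels, predicted_labels : equal-length integer arrays for the first objects
    Returns an (n_classes, n_classes) row-stochastic matrix.
    """
    counts = np.zeros((n_classes, n_classes))
    for k, j in zip(np.asarray(class_labels).astype(int),
                    np.asarray(predicted_labels).astype(int)):
        counts[k, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0          # avoid division by zero for classes never observed
    return counts / row_sums
```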
  • FIG. 1 is a schematic diagram of an exemplary on-demand service system according to some embodiments of the present disclosure
  • FIG. 2 is a block diagram of an exemplary mobile device configured to implement a specific system disclosed in the present disclosure
  • FIG. 3 is a block diagram illustrating an exemplary computing device according to some embodiments of the present disclosure
  • FIG. 4 is a block diagram illustrating an exemplary processing engine according to some embodiments of the present disclosure
  • FIG. 5 is a block diagram illustrating an exemplary predicted label determination module according to some embodiments of the present disclosure
  • FIG. 6 is a block diagram illustrating an exemplary combined subset determination module according to some embodiments of the present disclosure
  • FIG. 7 is a block diagram illustrating an exemplary classification predicting module according to some embodiments of the present disclosure.
  • FIG. 8 is a flowchart of an exemplary process for predicting a classification for an unlabeled object according to some embodiments of the present disclosure
  • FIG. 9 is a flowchart of an exemplary process for predicting a classification for an unlabeled object according to some embodiments of the present disclosure.
  • FIG. 10A is a schematic diagram illustrating an exemplary process for determining a predicted label for each of the first set of objects according to some embodiments of the present disclosure
  • FIG. 10B is a schematic diagram illustrating an exemplary process for generating one or more combined subsets of objects according to some embodiments of the present disclosure
  • FIG. 10C is a schematic diagram illustrating an exemplary process for determining one or more classification prediction models and one or more updated label transformation matrixes according to some embodiments of the present disclosure
  • FIG. 11 is a flowchart of an exemplary process for determining a predicted label for each of the first set of objects according to some embodiments of the present disclosure.
  • FIG. 12 is a flowchart of an exemplary process for predicting a classification for an unlabeled object based on one or more classification prediction models and one or more updated label transformation matrixes according to some embodiments of the present disclosure.
  • the terms "system," "module," and/or "block" used herein are one method to distinguish different components, elements, parts, sections, or assemblies of different levels in ascending order. However, the terms may be replaced by other expressions if they achieve the same purpose.
  • the term "module" refers to logic embodied in hardware or firmware, or to a collection of software instructions.
  • a module or a block described herein may be implemented as software and/or hardware and may be stored in any type of non-transitory computer-readable medium or another storage device.
  • a software module/unit/block may be compiled and linked into an executable program. It will be appreciated that software modules can be callable from other modules/units/blocks or themselves, and/or may be invoked in response to detected events or interrupts.
  • Software modules/units/blocks configured for execution on computing devices (e.g., the processor 320 as illustrated in FIG. 3) may be provided on a computer-readable medium, such as a compact disc, a digital video disc, a flash drive, a magnetic disc, or any other tangible medium, or as a digital download (and may be originally stored in a compressed or installable format that needs installation, decompression, or decryption prior to execution).
  • Such software code may be stored, partially or fully, on a storage device of the executing computing device, for execution by the computing device.
  • Software instructions may be embedded in firmware, such as an Electrically Programmable Read-Only-Memory (EPROM) .
  • modules/units/blocks may be included in connected logic components, such as gates and flip-flops, and/or can be included in programmable units, such as programmable gate arrays or processors.
  • the modules/units/blocks or computing device functionality described herein may be implemented as software modules/units/blocks but may be represented in hardware or firmware.
  • the modules/units/blocks described herein refer to logical modules/units/blocks that may be combined with other modules/units/blocks or divided into sub-modules/sub-units/sub-blocks despite their physical organization or storage. The description may apply to a system, an engine, or a portion thereof.
  • when a module or block is referred to as being "connected to" or "coupled to" another module or block, it may be directly connected or coupled to, or communicate with, the other module or block, or an intervening unit, engine, module, or block may be present, unless the context clearly indicates otherwise.
  • the term “and/or” includes any and all combinations of one or more of the associated listed items.
  • the flowcharts used in the present disclosure illustrate operations that systems implement according to some embodiments of the present disclosure. It is to be expressly understood that the operations of the flowcharts may be implemented out of the illustrated order. Conversely, the operations may be implemented in inverted order or simultaneously. Moreover, one or more other operations may be added to the flowcharts, and one or more operations may be removed from the flowcharts.
  • the system or method of the present disclosure may be applied to any other kind of online on-demand service.
  • the system or method of the present disclosure may be applied to different transportation systems including land, ocean, aerospace, or the like, or any combination thereof.
  • the vehicle of the transportation systems may include a taxi, a private car, a hitch, a bus, a train, a bullet train, a high speed rail, a subway, a vessel, an aircraft, a spaceship, a hot-air balloon, a driverless vehicle, a bicycle, a tricycle, a motorcycle, or the like, or any combination thereof.
  • the system or method of the present disclosure may be applied to taxi hailing, chauffeur services, delivery service, carpool, bus service, take-out service, driver hiring, vehicle hiring, bicycle sharing service, train service, subway service, shuttle services, location service, or the like.
  • the system or method of the present disclosure may be applied to shopping service, learning service, fitness service, financial service, social service, or the like.
  • the application scenarios of the system or method of the present disclosure may include a web page, a plug-in of a browser, a client terminal, a custom system, an internal analysis system, an artificial intelligence robot, or the like, or any combination thereof.
  • the object of the on-demand service may be any product.
  • the product may be a tangible product or an immaterial product.
  • the tangible product may include food, medicine, commodity, chemical product, electrical appliance, clothing, car, housing, luxury, or the like, or any combination thereof.
  • the immaterial product may include a servicing product, a financial product, a knowledge product, an internet product, or the like, or any combination thereof.
  • the internet product may include an individual host product, a web product, a mobile internet product, a commercial host product, an embedded product, or the like, or any combination thereof.
  • the mobile internet product may be used in a software of a mobile terminal, a program, a system, or the like, or any combination thereof.
  • the mobile terminal may include a tablet computer, a laptop computer, a mobile phone, a personal digital assistant (PDA), a smart watch, a point of sale (POS) device, an onboard computer, an onboard television, a wearable device, or the like, or any combination thereof.
  • the product may be any software and/or application used in the computer or mobile phone.
  • the software and/or application may relate to socializing, shopping, transporting, entertainment, learning, investment, or the like, or any combination thereof.
  • the software and/or application relating to transporting may include a traveling software and/or application, a vehicle scheduling software and/or application, a mapping software and/or application, etc.
  • the vehicle may include a horse, a carriage, a rickshaw (e.g., a wheelbarrow, a bike, a tricycle, etc. ) , a car (e.g., a taxi, a bus, a private car, etc. ) , a train, a subway, a vessel, an aircraft (e.g., an airplane, a helicopter, a space shuttle, a rocket, a hot-air balloon, etc. ) , or the like, or any combination thereof.
  • the term "user" in the present disclosure may refer to an individual, an entity, or a tool that may request a service, order a service, provide a service, or facilitate the providing of the service.
  • in the present disclosure, "user" and "user terminal" may be used interchangeably.
  • An aspect of the present disclosure relates to systems and methods for predicting a classification for an object.
  • the system may obtain data relating to a first set of objects.
  • the first set of objects may include a plurality of first objects and a plurality of second objects.
  • Each of the plurality of first objects may include a class label and each of the plurality of second objects may be unlabeled.
  • the system may determine a predicted label for each of the first set of objects based on the data relating to the first set of objects and a label propagation algorithm.
  • the system may determine an initial label transformation matrix with respect to the first set of objects based on the class labels and the predicted labels associated therewith.
  • the system may obtain one or more subsets of second objects by sampling the plurality of second objects.
  • the system may generate one or more combined subsets of objects based on the one or more subsets of second objects and the plurality of first objects.
  • the system may determine a classification prediction model and an updated label transformation matrix associated with each of the one or more combined subsets of objects based on a label noise-tolerant classification algorithm and the initial label transformation matrix.
  • the system may predict a classification for at least one of the plurality of second objects based on the classification prediction model and the updated label transformation matrix associated with each of the one or more combined subsets of objects.
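  • Read together, the operations above form a pipeline. The driver below sketches that flow, assuming the illustrative helpers defined in the other sketches in this document (propagate_labels, initial_label_transformation_matrix, sample_combined_subsets, fit_noise_tolerant_logreg, softmax_proba, predict_class) are in scope; every name, and the way a second probability vector is read off the updated matrix, is an assumption rather than the patented implementation.

```python
import numpy as np

def classify_unlabeled_objects(features, relations, labels, n_classes,
                               n_subsets=5, sample_ratio=0.2):
    """Illustrative end-to-end driver tying the summarized operations together."""
    features, labels = np.asarray(features, dtype=float), np.asarray(labels)
    predicted = propagate_labels(features, relations, labels, n_classes)          # predicted labels
    labeled_idx = np.flatnonzero(labels >= 0)
    unlabeled_idx = np.flatnonzero(labels < 0)
    T0 = initial_label_transformation_matrix(labels[labeled_idx],
                                             predicted[labeled_idx], n_classes)   # initial matrix

    models = []
    for subset in sample_combined_subsets(labeled_idx, unlabeled_idx,
                                          n_subsets, sample_ratio):               # combined subsets
        W, T = fit_noise_tolerant_logreg(features[subset], predicted[subset],
                                         T0, n_classes)                           # model + updated matrix
        models.append((W, T))

    predictions = {}
    for i in unlabeled_idx:
        first_cpvs = [softmax_proba(W, features[i]) for W, _ in models]
        # One illustrative way to read a probability vector off the updated matrix:
        # take the column of the object's predicted label and normalize it.
        second_cpvs = [T[:, predicted[i]] / (T[:, predicted[i]].sum() + 1e-12)
                       for _, T in models]
        predictions[int(i)] = predict_class(first_cpvs, second_cpvs)
    return predictions
```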
  • FIG. 1 is a block diagram of an exemplary on-demand service system 100 according to some embodiments.
  • the on-demand service system 100 may be an online on-demand service system for transportation services (e.g., taxi hailing, chauffeur services, delivery services, carpool, bus services, take-out services, driver hiring, vehicle hiring, train services, subway services, shuttle services) , shopping services, fitness services, learning services, financial services, or the like.
  • the on-demand service system 100 may include a server 110, a network 120, one or more user terminals (e.g., one or more passenger terminals 130, driver terminals 140) , and a storage 150.
  • the server 110 may include a processing engine 112. It should be noted that the on-demand service system 100 shown in FIG. 1 is merely an example, and not intended to be limiting. In some embodiments, the on-demand service system 100 may include the passenger terminal(s) 130 or the driver terminal(s) 140. In some embodiments, the on-demand service system 100 may determine or predict a classification for an object (e.g., a passenger associated with a passenger terminal 130, a driver associated with a driver terminal 140, etc.). In some embodiments, the on-demand service system 100 may acquire information relating to the classification for an object and/or provide customized service for the object based on the classification.
  • the server 110 may be a single server, or a server group.
  • the server group may be centralized, or distributed (e.g., server 110 may be a distributed system) .
  • the server 110 may be local or remote.
  • the server 110 may access information and/or data stored in the one or more user terminals (e.g., the one or more passenger terminals 130, driver terminals 140) , and/or the storage 150 via the network 120.
  • the server 110 may be directly connected to the one or more user terminals (e.g., the one or more passenger terminals 130, driver terminals 140) , and/or the storage 150 to access stored information and/or data.
  • the server 110 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the server 110 may be implemented on a computing device 300 having one or more components illustrated in FIG. 3 in the present disclosure.
  • the server 110 may include a processing engine 112.
  • the processing engine 112 may process information and/or data relating to one or more objects.
  • the processing engine 112 may determine or predict one or more classifications for one or more objects by processing information and/or data relating to the objects.
  • the objects may include one or more users (e.g., passengers, drivers, etc. ) .
  • the processing engine 112 may determine or predict an age group or gender for the users.
  • the processing engine 112 may include one or more processing engines (e.g., single-core processing engine(s) or multi-core processor(s)).
  • the processing engine 112 may include a central processing unit (CPU) , an application-specific integrated circuit (ASIC) , an application-specific instruction-set processor (ASIP) , a graphics processing unit (GPU) , a physics processing unit (PPU) , a digital signal processor (DSP) , a field-programmable gate array (FPGA) , a programmable logic device (PLD) , a controller, a microcontroller unit, a reduced instruction-set computer (RISC) , a microprocessor, or the like, or any combination thereof.
  • the network 120 may facilitate exchange of information and/or data.
  • one or more components in the on-demand service system 100 (e.g., the server 110, the one or more passenger terminals 130, the one or more driver terminals 140, or the storage 150) may send information and/or data to other component(s) in the on-demand service system 100 via the network 120.
  • the server 110 may obtain/acquire service request from the passenger terminal 130 via the network 120.
  • the server 110 may receive information relating to one or more objects from the storage 150 directly or via the network 120.
  • the server 110 may receive information relating to one or more objects from the passenger terminal 130 and/or the driver terminal 140 via the network 120.
  • the network 120 may be any type of wired or wireless network, or any combination thereof.
  • the network 120 may include a cable network, a wireline network, an optical fiber network, a telecommunications network, an intranet, an internet, a local area network (LAN), a wide area network (WAN), a wireless local area network (WLAN), a metropolitan area network (MAN), a public switched telephone network (PSTN), a Bluetooth network, a ZigBee network, a near field communication (NFC) network, or the like, or any combination thereof.
  • the network 120 may include one or more network access points.
  • the network 120 may include wired or wireless network access points such as base stations and/or internet exchange points 120-1, 120-2, ..., through which one or more components of the on-demand service system 100 may be connected to the network 120 to exchange data and/or information.
  • the passenger terminal 130 may include a mobile device 130-1, a tablet computer 130-2, a laptop computer 130-3, a built-in device in a motor vehicle 130-4, or the like, or any combination thereof.
  • the mobile device 130-1 may include a smart home device, a wearable device, a smart mobile device, a virtual reality device, an augmented reality device, or the like, or any combination thereof.
  • the smart home device may include a smart lighting device, a control device of an intelligent electrical apparatus, a smart monitoring device, a smart television, a smart video camera, an interphone, or the like, or combination thereof.
  • the wearable device may include a smart bracelet, a smart footgear, a smart glass, a smart helmet, a smart watch, a smart clothing, a smart backpack, a smart accessory, or the like, or any combination thereof.
  • the smart mobile device may include a smartphone, a personal digital assistant (PDA), a gaming device, a navigation device, a point of sale (POS) device, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a virtual reality helmet, a virtual reality glass, a virtual reality patch, an augmented reality helmet, an augmented reality glass, an augmented reality patch, or the like, or any combination thereof.
  • the virtual reality device and/or the augmented reality device may include a Google Glass, an Oculus Rift, a Hololens, a Gear VR, etc.
  • built-in device in the motor vehicle 130-4 may include an onboard computer, an onboard television, etc.
  • the passenger terminal 130 may be a device with positioning technology for locating the position of the service requester and/or the passenger terminal 130.
  • the driver terminal 140 may be similar to, or the same device as the passenger terminal 130. In some embodiments, the driver terminal 140 may be a device with positioning technology for locating the position of the driver and/or the driver terminal 140. In some embodiments, the passenger terminal 130 and/or the driver terminal 140 may communicate with other positioning device to determine the position of the service requester, the passenger terminal 130, the driver, and/or the driver terminal 140. In some embodiments, the passenger terminal 130 and/or the driver terminal 140 may send positioning information to the server 110.
  • the storage 150 may store data and/or instructions.
  • the data may be a training model, one or more training samples, historical orders, or the like, or a combination thereof.
  • the storage 150 may store data obtained from the one or more user terminals (e.g., the one or more passenger terminals 130, driver terminals 140) .
  • the storage 150 may store data and/or instructions that the server 110 may execute or use to perform exemplary methods described in the present disclosure.
  • the storage 150 may include a mass storage, a removable storage, a volatile read-and-write memory, a read-only memory (ROM) , or the like, or any combination thereof.
  • Exemplary mass storage may include a magnetic disk, an optical disk, a solid-state drive, etc.
  • Exemplary removable storage may include a flash drive, a floppy disk, an optical disk, a memory card, a zip disk, a magnetic tape, etc.
  • Exemplary volatile read-and-write memory may include a random access memory (RAM) .
  • Exemplary RAM may include a dynamic RAM (DRAM), a double data rate synchronous dynamic RAM (DDR SDRAM), a static RAM (SRAM), a thyristor RAM (T-RAM), a zero-capacitor RAM (Z-RAM), etc.
  • Exemplary ROM may include a mask ROM (MROM) , a programmable ROM (PROM) , an erasable programmable ROM (EPROM) , an electrically-erasable programmable ROM (EEPROM) , a compact disk ROM (CD-ROM) , and a digital versatile disk ROM, etc.
  • the storage 150 may be implemented on a cloud platform.
  • the cloud platform may include a private cloud, a public cloud, a hybrid cloud, a community cloud, a distributed cloud, an inter-cloud, a multi-cloud, or the like, or any combination thereof.
  • the storage 150 may be connected to the network 120 to communicate with one or more components in the on-demand service system 100 (e.g., the server 110, the one or more user terminals, etc. ) .
  • One or more components in the on-demand service system 100 may access the data and/or instructions stored in the storage 150 via the network 120.
  • the storage 150 may be directly connected to or communicate with one or more components in the on-demand service system 100 (e.g., the server 110, the one or more user terminals, etc. ) .
  • the storage 150 may be part of the server 110.
  • one or more components in the on-demand service system 100 may have a permission to access the storage 150.
  • one or more components in the on-demand service system 100 may read and/or modify information relating to the service requester, driver, and/or the public when one or more conditions are met.
  • the server 110 may read and/or modify one or more users' information after a service.
  • the on-demand service system 100 is merely an example for illustrating an application of the processing engine 112 for determining or predicting a classification for an object.
  • the processing engine 112 may be implemented on one or more other systems (e.g., a customer relationship management system, a project risk management system, an education management system, etc. ) .
  • the above description of the processing engine 112 and the on-demand service system 100 is provided for the purposes of illustration, and is not intended to limit the scope of the present disclosure.
  • FIG. 2 is a block diagram of an exemplary mobile device 200 configured to implement a specific system disclosed in the present disclosure.
  • a user terminal device configured to display and communicate information related to locations may be a mobile device 200.
  • the mobile device 200 may include but is not limited to a smartphone, a tablet computer, a music player, a portable game console, a GPS receiver, a wearable computing device (e.g., glasses, watches, etc.), or the like.
  • the mobile device 200 may include one or more central processing units (CPUs) 240, one or more graphical processing units (GPUs) 230, a display 220, a memory 260, a communication unit 210, a storage unit 290, and one or more input/output (I/O) devices 250.
  • the mobile device 200 may also include any other suitable component, including but not limited to a system bus or a controller (not shown in FIG. 2).
  • a mobile operating system 270 (e.g., iOS, Android, Windows Phone, etc.) and one or more applications 280 may be loaded from the storage unit 290 to the memory 260 and implemented by the CPUs 240.
  • the application 280 may include a browser or other mobile applications configured to receive and process information related to a query (e.g., a name of a location) inputted by a user in the mobile device 200.
  • the passenger/driver may obtain information related to one or more search results through the system I/O device 250, and provide the information to the server 110 and/or other modules or units of the on-demand service system 100 (e.g., the network 120) .
  • a computer hardware platform may be used as hardware platforms of one or more elements (e.g., the server 110 and/or other sections of the on-demand service system 100 described in FIG. 1 through FIG. 12). Since these hardware elements, operating systems, and programming languages are common, it may be assumed that persons skilled in the art may be familiar with these techniques and may be able to provide information required in the on-demand service according to the techniques described in the present disclosure.
  • A computer with a user interface may be used as a personal computer (PC) or another type of workstation or terminal device. After being properly programmed, a computer with a user interface may also be used as a server. It may be considered that those skilled in the art may also be familiar with such structures, programs, or general operations of this type of computer device. Thus, extra explanations are not described for the figures.
  • FIG. 3 is a block diagram illustrating exemplary hardware and software components of a computing device 300 on which the server 110, the one or more user terminals (e.g., the one or more passenger terminals 130, driver terminals 140) may be implemented according to some embodiments of the present disclosure.
  • the computing device 300 may be configured to perform one or more functions of the server 110, passenger terminal 130, and driver terminal 140 disclosed in this disclosure.
  • the processing engine 112 may be implemented on the computing device 300 and configured to perform functions of the processing engine 112 disclosed in this disclosure.
  • the computing device 300 may be a general-purpose computer or a special purpose computer, both may be used to implement an on-demand service system 100 for the present disclosure.
  • the computing device 300 may be used to implement any component of the on-demand service system 100 as described herein.
  • the processing engine 112 may be implemented on the computing device 300, via its hardware, software program, firmware, or a combination thereof.
  • although only one such computer is shown for convenience, the computer functions relating to the search service as described herein may be implemented in a distributed fashion on a number of similar platforms to distribute the processing load.
  • the computing device 300 may include COM ports 250 connected to and from a network connected thereto to facilitate data communications.
  • the computing device 300 may also include a processor 320, in the form of one or more processors, for executing program instructions.
  • the exemplary computer platform may include an internal communication bus 310, program storage and data storage of different forms, for example, a disk 370, and a read only memory (ROM) 330, or a random access memory (RAM) 340, for various data files to be processed and/or transmitted by the computer.
  • the exemplary computer platform may also include program instructions stored in the ROM 330, RAM 340, and/or other type of non-transitory storage medium to be executed by the processor 320.
  • the methods and/or processes of the present disclosure may be implemented as the program instructions.
  • the computing device 300 may also include an I/O component 360, supporting input/output between the computer and other components therein.
  • the computing device 300 may also receive programming and data via network communications.
  • the computing device 300 may also include a hard disk controller communicated with a hard disk, a keypad/keyboard controller communicated with a keypad/keyboard, a serial interface controller communicated with a serial peripheral equipment, a parallel interface controller communicated with a parallel peripheral equipment, a display controller communicated with a display, or the like, or any combination thereof.
  • the computing device 300 in the present disclosure may also include multiple CPUs and/or processors, thus operations and/or method steps that are performed by one CPU and/or processor as described in the present disclosure may also be jointly or separately performed by the multiple CPUs and/or processors.
  • for example, if the CPU and/or processor of the computing device 300 executes both step A and step B, step A and step B may also be performed by two different CPUs and/or processors jointly or separately in the computing device 300 (e.g., the first processor executes step A and the second processor executes step B, or the first and second processors jointly execute steps A and B).
  • FIG. 4 is a block diagram illustrating an exemplary processing engine 112 according to some embodiments of the present disclosure.
  • the processing engine 112 may be in communication with a computer-readable storage medium (e.g., the storage 150, a user terminal (e.g., a passenger terminal 130, a driver terminal 140), etc.) and may execute instructions stored in the computer-readable storage medium.
  • the processing engine 112 may include a data acquisition module 410, a predicted label determination module 420, a label transformation matrix (LTM) determination module 430, a combined subset determination module 440, a training module 450, and a classification predicting module 460.
  • the data acquisition module 410 may be configured to obtain one or more objects (e.g., a first set of objects) and/or data relating to the objects.
  • an object may include a user, an event, a substance, or the like, or any combination thereof.
  • an object may be a user of a service system, for example, a passenger or driver in the on-demand service system 100, a registered member in a social network platform, a learner in an online education system, etc.
  • an object may be an event that occurs via a service system, for example, a trip scheduled via the on-demand service system 100, a message interaction via a social network platform, a learning experience via an online education system, etc.
  • an object may be a substance involved in a service system, for example, a vehicle in the on-demand service system 100, a package in an express service system, etc. More descriptions of the object may be found elsewhere in the present disclosure, for example, FIG. 8 and the description thereof.
  • the data relating to the objects may include characteristic data and/or relationship data associated with the objects.
  • the data acquisition module 410 may obtain the characteristic data and/or the relationship data from a user terminal (e.g., the passenger terminal 130, the driver terminal 140) , the storage 150, and/or an external data source (not shown) .
  • the data acquisition module 410 may obtain the characteristic data and/or the relationship data via the network 120.
  • the predicted label determination module 420 may be configured to determine one or more predicted labels for one or more objects. For example, the predicted label determination module 420 may determine a predicted label for each of a first set of objects. The first set of objects may include a plurality of first objects and a plurality of second objects. Each of the plurality of first objects may include a class label and each of the plurality of second objects may be unlabeled. In some embodiments, the predicted label determination module 420 may determine a predicted label for an object based on the data relating to the object and/or a label propagation algorithm. More descriptions of the label propagation algorithm and/or the operation for determining the predicted label for one or more of the first set of objects may be found elsewhere in the present disclosure. See, for example, FIG. 11 and the description thereof.
  • the label transformation matrix (LTM) determination module 430 may be configured to determine an initial label transformation matrix for a second set of objects.
  • the second set of objects may include a portion or all of the first objects.
  • the LTM determination module 430 may determine the initial label transformation matrix based on the class labels of the second set of objects and the predicted labels of the second set of objects.
  • the LTM determination module 430 may determine a probability that a class label of each of the second set of objects is transformed to a predicted label of that object. Thus, a plurality of probabilities with respect to the second set of objects may be determined.
  • the LTM determination module 430 may determine the initial label transformation matrix based on the plurality of probabilities.
  • the combined subset determination module 440 may be configured to generate one or more combined subsets. In some embodiments, the combined subset determination module 440 may generate the combined subset (s) based on the first objects and one or more subsets of second objects. In some embodiments, a combined subset may include one of the one or more subsets of the second objects and a portion of the first objects. In some embodiments, a combined subset may include one of the one or more subsets of the second objects and all of the first objects.
  • the training module 450 may be configured to determine one or more classification prediction models and/or one or more updated label transformation matrixes. For example, the training module 450 may determine a classification prediction model and an updated label transformation matrix corresponding to one of the one or more combined subsets based on an initial label transformation matrix. The training module 450 may process one or more combined subsets using a label noise-tolerant classification algorithm. In some embodiments, the label noise-tolerant classification algorithm may be a multiclass label-noise robust logistic regression algorithm. In some embodiments, the training module 450 may train one or more combined subsets and/or an initial label transformation matrix using a label noise-tolerant classification algorithm. More descriptions of a training process may be found elsewhere in the present disclosure, for example, FIG. 9 and the description thereof.
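  • As a hedged sketch of what a multiclass label-noise robust logistic regression could look like: the observed (possibly noisy) label is modeled as a corrupted version of an unknown true class through the label transformation matrix, the weights are fit by gradient ascent on the marginal likelihood, and the matrix is refreshed with an EM-style update. Function names, the optimizer, and the update schedule are assumptions, not details from the disclosure.

```python
import numpy as np

def _softmax(scores):
    scores = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(scores)
    return e / e.sum(axis=1, keepdims=True)

def softmax_proba(W, x):
    """Class probabilities for a single feature vector under bias-augmented weight matrix W."""
    x = np.append(np.asarray(x, dtype=float), 1.0)       # append bias term
    return _softmax(x[None, :] @ W)[0]

def fit_noise_tolerant_logreg(X, noisy_labels, T0, n_classes,
                              n_epochs=200, lr=0.1, l2=1e-3):
    """Sketch of a label-noise-tolerant multiclass logistic regression (illustrative only).

    T0[k, j] is the initial probability that true class k is observed as label j.
    Returns the fitted weight matrix W and an updated label transformation matrix T.
    """
    X = np.hstack([np.asarray(X, dtype=float), np.ones((len(X), 1))])   # add a bias column
    y = np.asarray(noisy_labels).astype(int)
    n, d = X.shape
    W = np.zeros((d, n_classes))
    T = np.asarray(T0, dtype=float).copy()

    for _ in range(n_epochs):
        P = _softmax(X @ W)                              # P[i, k] = p(true class k | x_i)
        noise = T[:, y].T                                # noise[i, k] = T[k, y_i]
        R = P * noise
        R = R / (R.sum(axis=1, keepdims=True) + 1e-12)   # posterior over the true classes
        W += lr * (X.T @ (R - P) / n - l2 * W)           # gradient ascent on the log-likelihood

        # EM-style refresh of the label transformation matrix from the responsibilities.
        T_new = np.zeros_like(T)
        for j in range(n_classes):
            T_new[:, j] = R[y == j].sum(axis=0)
        T = T_new / (T_new.sum(axis=1, keepdims=True) + 1e-12)

    return W, T        # classification prediction model and updated label transformation matrix
```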
  • the classification predicting module 460 may be configured to predict a classification for an unlabeled object (e.g., a second object) .
  • the classification predicting module 460 may predict a classification for an unlabeled object based on one or more classification prediction models, one or more updated label transformation matrixes, and/or characteristic data of the unlabeled object.
  • the classification predicting module 460 may input data relating to the unlabeled object to the classification prediction model (s) and/or the label transformation matrix (es) , and predict the classification for the unlabeled object.
  • the training module 450 may include a correction unit (not shown) to correct and/or modify one or more classification prediction models and/or one or more updated label transformation matrixes. Similar modifications should fall within the scope of the present disclosure.
  • FIG. 5 is a block diagram illustrating an exemplary predicted label determination module 420 according to some embodiments of the present disclosure.
  • the predicted label determination module 420 may include a characteristic vector determination unit 510, a cosine similarity determination unit 520, and a label propagating unit 530.
  • the characteristic vector determination unit 510 may be configured to determine one or more characteristic vectors.
  • the characteristic vector determination unit 510 may transform characteristic data associated with an object (e.g., each of the first set of objects) into a characteristic vector.
  • Each element of a characteristic vector may correspond to a characteristic of an object.
  • the characteristic may include a first piece of information relating to one or more historical travel locations of an object, a second piece of information relating to one or more applications installed on a terminal device associated with the object, etc.
  • a value of each element of the characteristic vector may be 0, 1, or any number between 0 and 1.
  • when a value of an element of the characteristic vector of an object is 1, it may indicate that the object has the characteristic; for example, the object may have one or more historical travel locations.
  • When a value of an element of the characteristic vector of an object is 0, it may indicate that the object does not have the characteristic; for example, the object may have no historical travel location.
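  • A tiny illustration of such an encoding, using made-up trait names that are not taken from the disclosure:

```python
def to_characteristic_vector(object_traits, vocabulary):
    """Encode an object's characteristic data as a 0/1 vector over a fixed trait vocabulary.

    object_traits : set of traits the object has, e.g. {"has_travel_history", "app:taxi_hailing"}
    vocabulary    : ordered list of all traits considered (illustrative names)
    """
    return [1.0 if trait in object_traits else 0.0 for trait in vocabulary]

# Example: two illustrative traits
vocabulary = ["has_travel_history", "app:taxi_hailing"]
vec = to_characteristic_vector({"app:taxi_hailing"}, vocabulary)   # -> [0.0, 1.0]
```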
  • the cosine similarity determination unit 520 may be configured to determine one or more cosine similarity values between two related characteristic vectors.
  • the cosine similarity determination unit 520 may determine the cosine similarity value (s) based on two or more characteristic vectors and/or the relationship data associated with a set of objects (e.g., the first set of objects) .
  • when two characteristic vectors are related, it may indicate that the two objects corresponding to the two characteristic vectors have a direct relationship, which may be determined according to the relationship data.
  • the label propagating unit 530 may be configured to determine a predicted label for an object (e.g., each of the first set of objects) .
  • the label propagating unit 530 may determine a predicted label by propagating the class labels of one or more first objects to the first set of objects based on one or more cosine similarity values and/or the relationship data among the first set of objects.
  • the label propagating unit 530 may determine the predicted label for each of the first set of objects using a label propagation algorithm. Using the label propagation algorithm, a class label of an object may be propagated to another object based on the class label and a cosine similarity value between two characteristic vectors corresponding to the two objects.
  • FIG. 6 is a block diagram illustrating an exemplary combined subset determination module 440 according to some embodiments of the present disclosure.
  • the combined subset determination module 440 may include a sampling unit 610 and a combining unit 620.
  • the sampling unit 610 may be configured to obtain one or more subsets of unlabeled objects (e.g., second objects) .
  • the sampling unit 610 may obtain the subset (s) of second objects by sampling one or more second objects.
  • each of the one or more subsets may include a percentage of the second objects.
  • the sampling unit 610 may obtain the one or more subsets in a single sampling.
  • the sampling unit 610 may obtain the one or more subsets by sampling two or more times, where each subset may be obtained from an individual sampling.
  • the combining unit 620 may be configured to generate one or more combined subsets. In some embodiments, the combining unit 620 may generate a combined subset based on a subset of second objects and one or more first objects. In some embodiments, a combined subset may include a subset of the second objects and a portion of the first objects. In some embodiments, a combined subset may include a subset of the second objects and all of the first objects.
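  • A small sketch of the sampling-and-combining step, assuming objects are referenced by integer indices; sample_combined_subsets and its parameters are illustrative names.

```python
import numpy as np

def sample_combined_subsets(labeled_idx, unlabeled_idx, n_subsets, sample_ratio, seed=0):
    """Draw several subsets of unlabeled (second) objects and combine each with the labeled
    (first) objects, yielding index sets for training; illustrative only."""
    rng = np.random.default_rng(seed)
    k = max(1, int(len(unlabeled_idx) * sample_ratio))
    for _ in range(n_subsets):
        subset = rng.choice(unlabeled_idx, size=k, replace=False)   # a percentage of second objects
        yield np.concatenate([np.asarray(labeled_idx), subset])     # combined subset of objects
```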
  • FIG. 7 is a block diagram illustrating an exemplary classification predicting module 460 according to some embodiments of the present disclosure.
  • the classification predicting module 460 may include a classification probability vector (CPV) determination unit 710 and a classification determination unit 720.
  • the classification probability vector (CPV) determination unit 710 may be configured to determine one or more first classification probability vectors and/or one or more second classification probability vectors. In some embodiments, the CPV determination unit 710 may determine at least one first classification probability vector based on at least one classification prediction model and the characteristic data of an unlabeled object (e.g., a second object) . In some embodiments, the CPV determination unit 710 may determine at least one second classification probability vector based on at least one updated label transformation matrix and a predicted label of an unlabeled object (e.g., a second object) .
  • the classification determination unit 720 may be configured to determine a target classification probability vector based on the at least one first classification probability vector and the at least one second classification probability vector. Further, the classification determination unit 720 may determine a predicted classification for an unlabeled object (e.g., a second object) based on the target classification probability vector. In some embodiments, the classification determination unit 720 may designate a label relating to a maximal value of the target classification probability vector as the classification for the unlabeled object.
  • FIG. 8 is a flowchart of an exemplary process 800 for predicting a classification for an unlabeled object according to some embodiments of the present disclosure.
  • the process 800 for predicting a classification for the unlabeled object may be implemented in the on-demand service system 100 as illustrated in FIG. 1.
  • the process 800 may be implemented in a user terminal (e.g., the passenger terminal 130, the driver terminal 140) and/or the server 110.
  • the process 800 may also be implemented as one or more instructions stored in the storage 150 and called and/or executed by the processing engine 112.
  • the operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 800 may be accomplished with one or more additional operations not described, and/or without one or more of the operations described. Additionally, the order in which the operations of the process 800 are illustrated in FIG. 8 and described below is not intended to be limiting.
  • an object may include a user, an event, a substance, or the like, or any combination thereof.
  • an object may be a user of a service system, for example, a passenger or driver in the on-demand service system 100, a registered member in a social network platform, a learner in an online education system, etc.
  • an object may be an event that occurs via a service system, for example, a trip scheduled via the on-demand service system 100, a message interaction via a social network platform, a learning experience via an online education system, etc.
  • an object may be a substance involved in a service system, for example, a vehicle in the on-demand service system 100, a package in an express service system, etc.
  • the first set of objects may include one or more first objects and one or more second objects.
  • a first object may include a class label.
  • a second object may be unlabeled, i.e., the second object may include no class label.
  • a class label may indicate a classification of an object. If the object is a user, the classification of the object may include an age group of the user, a travel preference of the user, a travel time of the user, a consumption level of the user, a propensity to consume of the user, or the like, or any combination thereof. If the object is an event, the classification of the object may include a probability of the occurrence of the event, a time period in which the event may occur, a location where the event may occur, or the like, or any combination thereof.
  • the data relating to the first set of objects may include characteristic data associated with one or more (e.g., each) of the first set of objects and relationship data among the first set of objects.
  • the characteristic data associated with an object may indicate a characteristic of the object.
  • the characteristic data associated with the user may include a first piece of information relating to one or more historical travel locations of the user, a second piece of information relating to one or more applications installed on a terminal device (e.g., the passenger terminal 130, the driver terminal 140, etc. ) associated with the user, or the like, or any combination thereof.
  • the first piece of information relating to the historical travel location (s) of a user may be obtained from one or more applications for travelling that are registered by the user.
  • the relationship data among the first set of objects may indicate a plurality of relations among the first set of objects. If each of the first set of objects is a user, the relationship data between two users of the first set of objects may indicate the acquaintance level between the two users, i.e., whether the two users are friends and/or whether the two users are first-level friends or second-level friends, etc. In some embodiments, the relationship data may include a third piece of information associated with one or more red packets exchanged (e.g. money transactions) between two or more of the first set of objects, or a fourth piece of information relating to social network information of two or more of the first set of objects.
  • the third piece of information may be obtained based on a network application configured with currency transaction functions (e.g., WeChat TM , Alipay TM , etc. ) .
  • the fourth piece of information may be obtained based on a network application configured with social networking functions (e.g., a social application, a taxi-calling application, a video application, etc. ) . If each of the first set of objects is an event, the relationship data may indicate information related to two or more events of the first set of objects.
  • the information may be obtained based on a network application (e.g., an application illustrated above) .
  • the data relating to the first set of objects may be obtained from a user terminal (e.g., the passenger terminal 130, the driver terminal 140) , the storage 150, and/or an external data source (not shown) . In some embodiments, the data relating to the first set of objects may be obtained via the network 120.
  • the processing engine 112 may determine a predicted label for one or more of the first set of objects (e.g., one or more first objects and/or one or more second objects) .
  • the predicted label determination module 420 may determine a predicted label for each of the first set of objects (e.g., each first object and/or each second object) based on the data relating to the first set of objects obtained in 801 and/or a label propagation algorithm. More descriptions of the label propagation algorithm and/or the operation for determining the predicted label for one or more of the first set of objects may be found elsewhere in the present disclosure. See, for example, FIG. 11 and the description thereof.
  • the processing engine 112 may determine one or more classification prediction models and one or more label transformation matrixes based on the predicted label (s) determined in 803, the class labels of the first objects, the first set of objects, and/or a label noise-tolerant classification algorithm.
  • the classification prediction model (s) and label transformation matrix (es) may be determined by a training process. More descriptions of the training process and/or the label noise-tolerant classification algorithm may be found elsewhere in the present disclosure. See, for example, FIG. 9 and the description thereof.
  • the processing engine 112 may predict a classification for an unlabeled object (e.g., a second object) based on the classification prediction model (s) and the label transformation matrix (es) determined in 805.
  • the classification predicting module 460 may input data relating to the unlabeled object to the classification prediction model (s) and/or the label transformation matrix (es) , and predict the classification for the unlabeled object. More descriptions of the operation for predicting a classification for an unlabeled object may be found elsewhere in the present disclosure. See, for example, FIG. 12 and the description thereof.
  • FIG. 9 is a flowchart of an exemplary process 900 for predicting a classification for an unlabeled object according to some embodiments of the present disclosure.
  • the process 900 for predicting a classification for the unlabeled object may be implemented in the on-demand service system 100 as illustrated in FIG. 1.
  • the process 900 may be implemented in a user terminal (e.g., the passenger terminal 130, the driver terminal 140) and/or the server 110.
  • the process 900 may also be implemented as one or more instructions stored in the storage 150 and called and/or executed by the processing engine 112.
  • the operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 900 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 900 as illustrated in FIG. 9 and described below is not intended to be limiting.
  • the processing engine 112 may obtain a first set of objects including one or more first objects with class labels and one or more second objects that are unlabeled (see FIG. 10A) .
  • a class label may indicate a classification of a first object.
  • each of the first set of objects may be a user associated with an application (e.g., an online taxi-calling application) , an event, or the like.
  • the classification may include an age group of the user, a travel preference of the user, a travel time of the user, a consumption level of the user, or a propensity to consume of the user.
  • the classification may include a probability of the occurrence of the event. More descriptions of the first set of objects and the classification of an object may be found elsewhere in the present disclosure. See, for example, FIG. 8 and the description thereof.
  • the first set of objects may be obtained from a user terminal (e.g., the passenger terminal 130, the driver terminal 140) , the storage 150, and/or an external data source (not shown) .
  • the first set of objects may be obtained via the network 120.
  • the processing engine 112 may obtain data relating to the first set of objects (e.g., characteristic data and relationship data associated with the first set of objects) .
  • the characteristic data and the relationship data may be obtained from a user terminal (e.g., the passenger terminal 130, the driver terminal 140) , the storage 150, and/or an external data source (not shown) .
  • the characteristic data and the relationship data may be obtained via the network 120. More descriptions of the characteristic data and the relationship data may be found elsewhere in the present disclosure. See, for example, FIG. 8 and the description thereof.
  • the first set of objects may include four users (e.g., user A, user B, user C, and user D) .
  • the user A is a first object with a class label indicating an already known age group of the user A.
  • the user B, user C, and user D are second objects that are unlabeled. That is, the processing engine 112 may not know a classification (e.g., an age group) of the user B, user C, and user D.
  • the processing engine 112 may obtain relationship data associated with the user A, user B, user C, and user D.
  • a relationship network may include the user A, user B, user C, and user D.
  • the relationship data may indicate that there are a first-degree or first-level friendship between the user A and the user B, a second-degree or second-level friendship between the user A and the user C, and a third-degree or third-level friendship between the user A and the user D. Then the processing engine 112 may predict the age group for each of user B, user C, and user D according to the age group of the user A and the first-degree or first-level friendship, the second-degree or second-level friendship, and the third-degree or third-level friendship, respectively.
  • the relationship data may indicate that there are a fourth-degree or fourth-level friendship between the user A and the user B, a fifth-degree or fifth-level friendship between the user B and the user C, and a sixth-degree or sixth-level friendship between the user C and the user D.
  • the processing engine 112 may predict the age group for the user B according to the age group of the user A and the fourth-degree or fourth-level friendship.
  • the processing engine 112 may predict the age group for the user C according to the age group of user B and the fifth-degree or fifth-level friendship.
  • the processing engine 112 may predict the age group for the user D according to the age group of user C and the sixth-degree or sixth-level friendship.
  • the processing engine 112 may determine a predicted label for each of the first set of objects based on the data relating to the first set of objects (e.g., the characteristic data and the relationship data) obtained in 903.
  • each first object may have a predicted label
  • each second object may have a predicted label (see FIG. 10A) .
  • a predicted label may indicate a predicted classification of an object of the first set of objects.
  • the processing engine 112 may determine the predicted label for each of the first set of objects using a label propagation algorithm.
  • each of the first objects may include a class label and a predicted label
  • each of the second objects may only include a predicted label. More descriptions of the process for determining the predicted label for each of the first set of objects may be found elsewhere in the present disclosure. See, for example, FIG. 11 and the description thereof.
  • the processing engine 112 may determine an initial label transformation matrix with respect to the first set of objects.
  • the initial label transformation matrix may relate to the second set of objects.
  • the second set of objects may include all the first objects with class labels of the first set of objects.
  • the second set of objects may include only a portion of the first objects.
  • the second set of objects may include a percentage of the first objects. For example, if the first set of objects includes 1000 first objects and the percentage is 10%, then the second set of objects may include 100 (i.e., 1000 × 10%) first objects.
  • the processing engine 112 may determine the initial label transformation matrix with respect to the first set of objects based on the class labels and the predicted labels associated therewith. For example, the processing engine 112 (e.g., the LTM determination module 430) may determine the initial label transformation matrix based on the class labels of the second set of objects and the predicted labels of the second set of objects. In some embodiments, the processing engine 112 (e.g., the LTM determination module 430) may determine a probability that a class label of each of the second set of objects is transformed to a predicted label of the each of the second set of objects. Thus, a plurality of probabilities with respect to the second set of objects may be determined.
  • the processing engine 112 may determine the initial label transformation matrix based on the plurality of probabilities. In some embodiments, the processing engine 112 (e.g., the LTM determination module 430) may determine the initial label transformation matrix based on one or more algorithms (e.g., a counting algorithm) . For the purposes of illustration, taking four age groups (e.g., 10-20 years old, 20-30 years old, 30-40 years old, and 40-50 years old) as an example, the initial label transformation matrix for the second set of objects may be a 4 × 4 matrix. Each element of the initial label transformation matrix may represent a probability that the class label of a first object may transform to a predicted label.
  • a row or column of the initial label transformation matrix may include four elements, i.e., 0.1, 0.5, 0.4, and 0. Other rows or columns of the initial label transformation matrix may be determined in a similar way.
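  • For illustration only, a minimal sketch of such a counting algorithm is given below. It assumes that class labels and predicted labels are encoded as integer indices 0, ..., K-1 and that each row of the matrix is normalized to sum to 1; the function name and encoding are illustrative assumptions rather than details taken from the text.

```python
import numpy as np

def initial_label_transformation_matrix(class_labels, predicted_labels, num_classes):
    """Estimate, by counting, the probability that a class label i is transformed
    to a predicted label j over the labeled objects."""
    counts = np.zeros((num_classes, num_classes))
    for class_label, predicted_label in zip(class_labels, predicted_labels):
        counts[class_label, predicted_label] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1  # avoid division by zero for unseen class labels
    return counts / row_sums

# With four age groups the result is a 4 x 4 matrix; a row such as
# [0.1, 0.5, 0.4, 0.0] would match the example above.
```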
  • the processing engine 112 may obtain one or more subsets of second objects (e.g., n subsets of second objects) by sampling the second object (s) obtained in 901.
  • the number n may be an integer larger than 0 (e.g., 2, 3, 4, 5, etc. ) .
  • the second object (s) may be sampled randomly, sequentially, etc. If n is equal to or larger than 2, no two of the n subsets may share a same object. That is, the n subsets may include distinct samples of the plurality of second objects (see FIG. 10B) .
  • each of the n subsets may include a percentage of the second objects.
  • the n subsets may be obtained in a single sampling.
  • the n subsets may be obtained by n times of sampling, where each subset may be obtained by each individual sampling.
  • the processing engine 112 may generate one or more combined subsets of objects (e.g., n combined subsets of objects) based on the subsets of second objects obtained in 909 and the first object (s) obtained in 901.
  • a combined subset may include one of the n subsets of the second objects and a portion of the first objects.
  • a combined subset may include one of the n subsets of the second objects and all of the first objects (see FIG. 10B) .
  • the combined subset determination module 440 may obtain three subsets (e.g., a first subset M1, a second subset M2, and a third subset M3) .
  • a third set of objects including all of the first objects may be represented by D.
  • a first combined subset F1 may be generated by combining the first subset M1 and the third set of objects D, and may be expressed as {M1, D} .
  • a second combined subset F2 may be generated by combining the second subset M2 and the third set of objects D, and may be expressed as {M2, D} .
  • a third combined subset F3 may be generated by combining the third subset M3 and the third set of objects D, and may be expressed as {M3, D} .
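  • As an illustration of the sampling and combining described above (see also FIG. 10B), a minimal sketch is given below; the fixed subset size, the random non-overlapping sampling, and the function name are assumptions.

```python
import random

def build_combined_subsets(first_objects, second_objects, n, percentage):
    """Draw n distinct (non-overlapping) subsets of the second objects and pair each
    subset with all of the first objects, e.g., F1 = {M1, D}, F2 = {M2, D}, F3 = {M3, D}."""
    subset_size = int(len(second_objects) * percentage)
    assert n * subset_size <= len(second_objects), "not enough second objects to sample"
    shuffled = random.sample(second_objects, len(second_objects))  # shuffled copy, no repeats
    subsets = [shuffled[i * subset_size:(i + 1) * subset_size] for i in range(n)]
    return [list(first_objects) + subset for subset in subsets]

# Example: three combined subsets, each pairing 10% of the second objects with all
# of the first objects.
# combined = build_combined_subsets(first_objects, second_objects, n=3, percentage=0.1)
```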
  • the processing engine 112 may determine a classification prediction model and an updated label transformation matrix associated with each of the one or more combined subsets of objects (e.g., n combined subsets of objects) based on the initial label transformation matrix determined in 907.
  • the processing engine 112 (e.g., the training module 450) may determine n classification prediction models and n updated label transformation matrixes (see FIG. 10C) .
  • the training module 450 may process each of the n combined subsets using a label noise-tolerant classification algorithm.
  • the label noise-tolerant classification algorithm may be a multiclass label-noise robust logistic regression algorithm.
  • multiple classification prediction models (e.g., n classification prediction models) may be determined to improve the accuracy for predicting a classification for an unlabeled object.
  • the training module 450 may train the n combined subsets and the initial label transformation matrix using any label noise-tolerant classification algorithm, for example, the robust multiclass logistic regression.
  • a training process may be implemented by an application (e.g., an application based on Spark (a cluster computing framework) ) installed on a user terminal (e.g., the passenger terminal 130, the driver terminal 140) or the server 110.
  • An exemplary training process may be described as below.
  • a classifier may be determined based on Equation (1) :
  • x_q may refer to a characteristic vector of an object q
  • k may refer to a kth classification
  • K may refer to the number of classifications
  • w_k may refer to the weight vector corresponding to class k.
  • a target function may be determined based on Equation (2) :
  • the maximum likelihood (ML) estimate of w_k may be obtained by maximizing the data log-likelihood, as illustrated in Equation (3) :
  • L (w) may refer to the data log-likelihood that is maximized to obtain the ML estimate of w_k .
  • the target function may be optimized based on the Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm to determine the updated label transformation matrixes.
  • the L-BFGS algorithm may be realized based on a gradient g.
  • the gradient g may be described as Equation (4) :
  • γ_jk may refer to a label-flipping probability and may be described as Equation (5) :
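  • The images of Equations (1) through (6) are not reproduced in this text. Based only on the definitions above (x_q, w_k, K, γ_jk) and on the standard form of multiclass label-noise robust logistic regression, plausible reconstructions of the key expressions are sketched below; they are assumptions, not the original equations of the disclosure.

```latex
% Plausible form of Equation (1): softmax classifier over the K classifications
P(y_q = k \mid x_q, W) = \frac{\exp(w_k^{\top} x_q)}{\sum_{k'=1}^{K} \exp(w_{k'}^{\top} x_q)}

% Plausible form of Equation (2): noisy-label posterior via the label
% transformation (label-flipping) probabilities
P(\tilde{y}_q = k \mid x_q, W, \Gamma) = \sum_{j=1}^{K} \gamma_{jk}\, P(y_q = j \mid x_q, W)

% Plausible form of Equation (3): data log-likelihood maximized to obtain the
% ML estimate of w_k
L(w) = \sum_{q} \log \sum_{j=1}^{K} \gamma_{j\tilde{y}_q}\, P(y_q = j \mid x_q, W)

% Plausible form of Equation (5): label-flipping probability
\gamma_{jk} = P(\tilde{y} = k \mid y = j)

% Equation (4) would then be the gradient g of L(w) with respect to w_k, which is
% the quantity consumed by the L-BFGS routine.
```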
  • the processing engine 112 may predict a classification for at least one of the second objects (e.g., a second object Ai) based on the classification prediction model and the updated label transformation matrix associated with each of the combined subset (s) of objects (e.g., the n classification prediction models and/or the n updated label transformation matrixes determined in 913) .
  • the classification predicting module 460 may predict a classification for the second object Ai based on the n classification prediction models, the n updated label transformation matrixes, and/or the characteristic data of the second object Ai (see FIG. 10C) . More descriptions of the process for predicting a classification for an unlabeled second object (e.g., the second object Ai) may be found elsewhere in the present disclosure. See, for example, FIG. 12 and the description thereof.
  • operations 901 and 903 may be integrated into one single operation.
  • in some embodiments, a training set of objects (e.g., the first objects with class labels) may be expanded by adding unlabeled objects (e.g., the unlabeled second objects) that have obtained predicted labels.
  • the training set of objects may be expanded based on the relationship data (e.g., the information relating to sending of red packet (s) ) among a first plurality of passengers whose age group is already known and a second plurality of passengers whose age group is unknown.
  • the second plurality of passengers whose age group is unknown may have predicted labels (e.g., predicted age group (s) ) through label propagation, and thus, the second plurality of passengers may be added into the training set of objects.
  • FIG. 10A is a schematic diagram illustrating an exemplary process for determining a predicted label for each of the first set of objects according to some embodiments of the present disclosure.
  • the first set of objects may include one or more first objects 1010 and one or more second objects 1020.
  • the first objects 1010 may include a first object 1010-1, a first object 1010-2, ..., and a first object 1010-x.
  • Each of the first objects 1010 may include a class label.
  • the class label may be represented by “CL” (as shown in FIG. 10A) .
  • the second objects 1020 may include a second object 1020-1, a second object 1020-2, ..., and a second object 1020-y.
  • Each of the second objects 1020 may be unlabeled, i.e., each of the second objects 1020 may include no class label.
  • the number of the first objects 1010 and the number of the second objects 1020 may be the same or different.
  • Each of the first set of objects may obtain a predicted label through label propagating.
  • the predicted label may be represented by “PL” (as shown in FIG. 10A) .
  • the first objects 1030 may include the first objects 1010 and one or more predicted labels.
  • the first object 1010-1 with a class label and a predicted label may be represented by a first object 1030-1.
  • the first object 1010-2 with a class label and a predicted label may be represented by a first object 1030-2.
  • the first object 1010-x with a class label and a predicted label may be represented by a first object 1030-x.
  • the second objects 1040 may include the second objects 1020 and one or more predicted labels.
  • the second object 1020-1 with a predicted label may be represented by a second object 1040-1.
  • the second object 1020-2 with a predicted label may be represented by a second object 1040-2.
  • the second object 1020-y with a predicted label may be represented by a second object 1040-y.
  • FIG. 10B is a schematic diagram illustrating an exemplary process for generating one or more combined subsets of objects according to some embodiments of the present disclosure.
  • the processing engine 112 (e.g., the combined subset determination module 440) may obtain each subset by sampling one or more objects from the second objects 1040.
  • the subsets 1050 may include a first subset 1050-1, a second subset 1050-2, ..., and an nth subset 1050-n. No two of the n subsets 1050 may share a same object. That is, the n subsets may include distinct samples of the plurality of second objects 1040.
  • each of the n subsets 1050 may include a percentage of the second objects 1040.
  • the processing engine 112 may generate one or more combined subsets 1060 based on the n subsets and the first objects 1030.
  • the combined subsets 1060 may include a first combined subset 1060-1, a second combined subset 1060-2, ..., and an nth combined subset 1060-n.
  • Each of the n combined subsets may include a portion or all of the first objects 1030 and one of the n subsets 1050.
  • the first combined subset 1060-1 may include the first subset 1050-1 and all the first objects 1030.
  • the second combined subset 1060-2 may include the second subset 1050-2 and all the first objects 1030.
  • the nth combined subset 1060-n may include the nth subset 1050-n and all the first objects 1030.
  • FIG. 10C is a schematic diagram illustrating an exemplary process for determining one or more classification prediction models and one or more updated label transformation matrixes according to some embodiments of the present disclosure.
  • the processing engine 112 (e.g., the training module 450) may train the combined subsets 1060 and the initial label transformation matrix 1065 based on a label noise-tolerant classification algorithm.
  • the training module 450 may determine n classification prediction models 1070 and n updated label transformation matrixes 1080 by processing each of the n combined subsets 1060 and the initial label transformation matrix 1065.
  • the classification prediction models 1070 may include a first classification prediction model 1070-1, a second classification prediction model 1070-2, ..., and an nth classification prediction model 1070-n.
  • the updated label transformation matrixes 1080 may include a first updated label transformation matrix 1080-1, a second updated label transformation matrix 1080-2, ..., and an nth updated label transformation matrix 1080-n.
  • the training module 450 may determine the first classification prediction model 1070-1 and the first updated label transformation matrix 1080-1 by training the first combined subset 1060-1 and the initial label transformation matrix 1065 using the label noise-tolerant classification algorithm.
  • the training module 450 may determine the second classification prediction model 1070-2 and the second updated label transformation matrix 1080-2 by training the second combined subset 1060-2 and the initial label transformation matrix 1065 using the label noise-tolerant classification algorithm.
  • the training module 450 may determine the nth classification prediction model 1070-n and the nth updated label transformation matrix 1080-n by training the nth combined subset 1060-n and the initial label transformation matrix 1065 using the label noise-tolerant classification algorithm.
  • the processing engine 112 may predict a classification for a second object Ai by inputting characteristic data of the second object Ai to the n classification prediction models 1070 and inputting the predicted label of the second object Ai to the n updated label transformation matrixes 1080. Then a classification for the second object Ai may be determined.
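  • A minimal sketch of this prediction step is given below. It assumes that each classification prediction model exposes a predict_proba-style call returning a probability vector over the classifications, and that the second classification probability vector is read out as the row of the updated label transformation matrix indexed by the predicted label; neither detail is stated explicitly in the text.

```python
import numpy as np

def probability_vectors_for_object(x_i, predicted_label_i, models, updated_matrices):
    """Collect the n first and n second classification probability vectors for a
    second object Ai (see FIG. 10C)."""
    first_vectors = [model.predict_proba(x_i) for model in models]
    second_vectors = [np.asarray(matrix)[predicted_label_i, :] for matrix in updated_matrices]
    return first_vectors, second_vectors
```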
  • the processing engine 112 may combine semi-supervised learning (e.g., label propagating) and supervised learning (e.g., training of classification prediction models) to predict a classification for an unlabeled object. As the predicted labels for unlabeled objects are combined into the supervised learning, and multiple classification prediction models are determined, the accuracy and/or stability for predicting a classification for an unlabeled object may be improved.
  • FIG. 11 is a flowchart of an exemplary process 1100 for determining a predicted label for each of the first set of objects according to some embodiments of the present disclosure.
  • the process 1100 for determining a predicted label for each of the first set of objects may be implemented in the system 100 as illustrated in FIG. 1.
  • the process 1100 may be implemented in a user terminal (e.g., the passenger terminal 130, the driver terminal 140) and/or the server 110.
  • the process 1100 may also be implemented as one or more instructions stored in the storage 150 and called and/or executed by the processing engine 112.
  • the operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1100 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 1100 as illustrated in FIG. 11 and described below is not intended to be limiting.
  • the processing engine 112 may transform characteristic data associated with each of the first set of objects into a characteristic vector to obtain one or more characteristic vectors.
  • Each element of a characteristic vector may correspond to a characteristic of an object of the first set of objects.
  • the characteristic may include a first piece of information relating to one or more historical travel locations of an object, a second piece of information relating to one or more applications installed on a terminal device associated with the object, etc.
  • a value of each element of the characteristic vector may be 0, 1, or any number between 0 and 1.
  • the object may have one or more historical travel locations.
  • the object may have no historical travel location.
  • the processing engine 112 may determine one or more cosine similarity values between two related characteristic vectors based on the characteristic vector (s) obtained in 1101 and/or the relationship data among the first set of objects.
  • if two characteristic vectors are related, it may indicate that the two objects corresponding to the two characteristic vectors have a direct relationship that can be determined according to the relationship data.
  • if the relationship data includes the third piece of information relating to one or more red packets exchanged between two or more of the first set of objects, two objects having a direct relationship may indicate that an object of the two objects has directly sent one or more red packets to the other object of the two objects.
  • the cosine similarity determination unit 520 may determine cosine similarity value (s) between each two related characteristic vectors.
  • the cosine similarity value (s) between two related characteristic vectors may be determined as Equation (7) :
  • a may refer to a first characteristic vector of two related characteristic vectors
  • b may refer to a second characteristic vector of the two related characteristic vectors
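  • The image of Equation (7) is not reproduced in this text; the standard cosine similarity between the two related characteristic vectors a and b, which is assumed here to be what Equation (7) expresses, is:

```latex
\cos(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
           = \frac{\sum_{i} a_i b_i}{\sqrt{\sum_{i} a_i^{2}}\,\sqrt{\sum_{i} b_i^{2}}}
```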
  • the processing engine 112 may determine a predicted label for each of the first set of objects by propagating one or more class labels of the first object (s) to the first set of objects based on the cosine similarity value (s) determined in 1103 and/or the relationship data among the first set of objects.
  • the label propagating unit 530 may determine the predicted label for each of the first set of objects using a label propagation algorithm. Using the label propagation algorithm, a class label of an object may be propagated to another object based on the class label and a cosine similarity value between two characteristic vectors corresponding to the two objects. For the purpose of illustration, an exemplary process with respect to the operations 1101 through 1105 may be described below.
  • each of the first set of objects may be a user, and a classification of the user may be an age group.
  • the classifications of the users may include four age groups, for example, 10-20 years old, 20-30 years old, 30-40 years old, and 40-50 years old.
  • the user A may have a class label indicating a classification of the age group of user A.
  • the user B, user C, and user D may have no class label, i.e., the user B, user C, and user D may be unlabeled.
  • the classification of the user A may be known as 30-40 years old, while the classification of the user B, user C, and user D may be unknown.
  • Relationship data may indicate a relationship network among the user A, user B, user C, and user D.
  • user A may have a direct relationship with each of user B, user C, and user D.
  • Characteristic data of the users A, B, C, and D may include information relating to one or more historical travel locations of the users.
  • the historical travel locations may be located in a city. Taking a city (e.g., Beijing) as an example, the city may include 100 travel locations.
  • the characteristic data of the users A, B, C, and D may indicate whether each of the users A, B, C, and D has traveled to at least one of the 100 travel locations in the city of Beijing.
  • the processing engine 112 may transform the characteristic data of the user to a characteristic vector. Each characteristic vector may have 100 elements corresponding to the 100 travel locations.
  • if a historical travel location of the user (e.g., user A, user B, user C, or user D) is identical to one of the 100 travel locations, a value of an element associated with the one travel location of the characteristic vector may be 1. If the historical travel location of the user is not identical to one of the 100 travel locations, the value of the element associated with the one travel location of the characteristic vector may be 0.
  • the characteristic vector of user A may be [0, 0, 1, ..., 1, 0] .
  • Each of the users B, C, and D may have a corresponding characteristic vector, respectively.
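  • A minimal sketch of this transformation is given below; the function name and the representation of the travel locations are illustrative assumptions.

```python
def to_characteristic_vector(visited_locations, all_locations):
    """Binary characteristic vector: element i is 1 if the user has traveled to the
    i-th of the 100 travel locations, and 0 otherwise."""
    visited = set(visited_locations)
    return [1 if location in visited else 0 for location in all_locations]

# For user A in the example, the result could look like [0, 0, 1, ..., 1, 0].
```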
  • a label probability matrix of user A may be [0, 0, 1, 0] .
  • a label probability matrix may indicate one or more probabilities that an object (e.g., the user A) may be classified as each of the classifications (e.g., the four age groups) .
  • a first element of the label probability matrix of user A may indicate that the probability that user A is classified as a first age group (i.e., 10-20 years old) is 0.
  • a second element of the label probability matrix of user A may indicate that the probability that user A is classified as a second age group (i.e., 20-30 years old) is 0.
  • a third element of the label probability matrix of user A may indicate that the probability that user A is classified as a third age group (i.e., 30-40 years old) is 1.
  • a fourth element of the label probability matrix of user A may indicate that the probability that user A is classified as a fourth age group (i.e., 40-50 years old) is 0.
  • Each of the users B, C, and D may have a corresponding label probability matrix, respectively.
  • the processing engine 112 may determine a cosine similarity value between two related characteristic vectors associated with two of the users A, B, C, and D that have a direct relationship.
  • a cosine similarity value may indicate the intimacy of two users. If there is a direct relationship relating to one or more red packets exchanged between the user A and user B, a cosine similarity value between the users A and B may be determined based on the characteristic vector of the user A and the characteristic vector of the user B. For example, the cosine similarity value between the user A and user B may be 0.7.
  • a label probability matrix of the user B may be determined based on the cosine similarity value (e.g., 0.7) and/or the label probability matrix (e.g., [0, 0, 1, 0] ) of the user A.
  • a label probability matrix of the user B may be [0.1, 0.1, 0.7, 0.1] .
  • the label probability matrix of the user B may indicate that the probability that the user B is classified as a first age group (i.e., 10-20 years old) is 0.1, the probability that the user B is classified as a second age group (i.e., 20-30 years old) is 0.1, the probability that the user B is classified as a third age group (i.e., 30-40 years old) is 0.7, and the probability that the user B is classified as a fourth age group (i.e., 40-50 years old) is 0.1.
  • a predicted label for the user B may be the third age group (i.e., 30-40 years old) .
  • a predicted label for the user C and user D may be determined.
  • the cosine similarity value between the user A and user C may be 0.76.
  • a label probability matrix of the user C may be determined as [0.08, 0.08, 0.76, 0.08] .
  • the label probability matrix of the user C may indicate that the probability that the user C is classified as a first age group (i.e., 10-20 years old) is 0.08, the probability that the user C is classified as a second age group (i.e., 20-30 years old) is 0.08, the probability that the user C is classified as a third age group (i.e., 30-40 years old) is 0.76, and the probability that the user C is classified as a fourth age group (i.e., 40-50 years old) is 0.08.
  • a predicted label for the user C may be the third age group (i.e., 30-40 years old) .
  • the cosine similarity value between the user A and user D may be 0.91.
  • a label probability matrix of the user D may be determined as [0.03, 0.03, 0.91, 0.03] .
  • the label probability matrix of the user D may indicate that the probability that the user D is classified as a first age group (i.e., 10-20 years old) is 0.03, the probability that the user D is classified as a second age group (i.e., 20-30 years old) is 0.03, the probability that the user D is classified as a third age group (i.e., 30-40 years old) is 0.91, and the probability that the user D is classified as a fourth age group (i.e., 40-50 years old) is 0.03.
  • a predicted label for the user D may be the third age group (i.e., 30-40 years old) .
  • the label probability matrix of each of the users A, B, C, and D may be shown in Table 1.
  • Table 1 Label probability matrixes of different users
    User A: [0, 0, 1, 0]
    User B: [0.1, 0.1, 0.7, 0.1]
    User C: [0.08, 0.08, 0.76, 0.08]
    User D: [0.03, 0.03, 0.91, 0.03]
  • the user A may have a direct relationship with the user B and the user D.
  • the user B may have a direct relationship with the user C.
  • the user C may have a direct relationship with the user D.
  • the user A may have a class label indicating a classification of the age group of user A.
  • the user B, user C, and user D may have no class label, i.e., the user B, user C, and user D may be unlabeled.
  • the classification of the user A may be known as 30-40 years old, while the classification of the user B, user C, and user D may be unknown.
  • a label probability matrix of the user A may be [0, 0, 1, 0] .
  • each of the users A, B, C, and D may have a characteristic vector, respectively.
  • the processing engine 112 may determine a cosine similarity value between the user A and the user B, a cosine similarity value between the user B and the user C, and a cosine similarity value between the user C and the user D.
  • the cosine similarity value between the user A and the user B may be 0.2
  • the cosine similarity value between the user B and the user C may be 0.8
  • the cosine similarity value between the user C and the user D may be 0.4
  • the cosine similarity value between the user D and the user A may be 0.6.
  • a label probability matrix of the user B may be determined based on the cosine similarity value between the user A and the user B, the cosine similarity value between the user B and the user C, the label probability matrix of the user A, and the label probability matrix of the user C. Since the label probability matrix of the user C is unknown, the processing engine 112 may set an initial matrix for the user C, for example, [0.25, 0.25, 0.25, 0.25] . Then a label probability matrix of the user B may be determined as [0.2, 0.2, 0.4, 0.2] .
  • a label probability matrix of the user C may be determined based on the cosine similarity value between the user B and the user C, the cosine similarity value between the user C and the user D, the label probability matrix of the user B, and the label probability matrix of the user D. Since the label probability matrix of the user D is unknown, the processing engine 112 may set an initial matrix for the user D, for example, [0.25, 0.25, 0.25, 0.25] . Then a label probability matrix of the user C may be determined as [13/50, 13/50, 21/50, 13/50] . In some embodiments, the label probability matrix of the user C may be normalized as [13/60, 13/60, 21/60, 13/60] .
  • a label probability matrix of the user D may be determined based on the cosine similarity value between the user C and the user D, the cosine similarity value between the user D and the user A, the label probability matrix of the user A, and the label probability matrix of the user C. Then a label probability matrix of the user D may be determined as [13/150, 13/150, 111/150, 13/150] , as shown in Table 2.
  • Table 2 Label probability matrixes of different users
    User A: [0, 0, 1, 0]
    User B: [0.2, 0.2, 0.4, 0.2]
    User C: [13/50, 13/50, 21/50, 13/50] (normalized: [13/60, 13/60, 21/60, 13/60])
    User D: [13/150, 13/150, 111/150, 13/150]
  • the processing engine 112 may set an initial matrix for the user (e.g., [0.25, 0.25, 0.25, 0.25] ) .
  • the initial matrix may include default values determined by the on-demand service system 100 or may be preset by a user or operator via a terminal.
  • the label probability matrixes of one or more users may be determined based on one or more iterations until convergence.
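  • A minimal sketch of such an iterative propagation is given below. The row normalization, the clamping of the labeled first objects, and the convergence test are one plausible realization and are not prescribed by the text.

```python
import numpy as np

def propagate_labels(similarity, initial_probs, is_labeled, max_iters=100, tol=1e-6):
    """Iteratively propagate label probability matrices over the relationship network
    until convergence.

    similarity:    (N, N) cosine similarity values between directly related objects,
                   0 where there is no direct relationship.
    initial_probs: (N, K) label probability matrices; labeled objects carry one-hot
                   class labels, unlabeled objects a uniform prior such as
                   [0.25, 0.25, 0.25, 0.25].
    is_labeled:    boolean NumPy array of length N marking the first objects.
    """
    probs = np.array(initial_probs, dtype=float)
    clamped = probs[is_labeled].copy()
    for _ in range(max_iters):
        updated = similarity @ probs                    # similarity-weighted neighbor average
        row_sums = updated.sum(axis=1, keepdims=True)
        row_sums[row_sums == 0] = 1
        updated = updated / row_sums                    # normalize each row to sum to 1
        updated[is_labeled] = clamped                   # keep the known class labels fixed
        if np.abs(updated - probs).max() < tol:
            return updated
        probs = updated
    return probs
```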
  • FIG. 12 is a flowchart of an exemplary process 1200 for predicting a classification for an unlabeled object based on one or more classification prediction models and one or more updated label transformation matrixes according to some embodiments of the present disclosure.
  • the process 1200 for predicting a classification for an unlabeled object may be implemented in the system 100 as illustrated in FIG. 1.
  • the process 1200 may be implemented in a user terminal (e.g., the passenger terminal 130, the driver terminal 140) and/or the server 110.
  • the process 1200 may also be implemented as one or more instructions stored in the storage 150 and called and/or executed by the processing engine 112.
  • the operations of the illustrated process presented below are intended to be illustrative. In some embodiments, the process 1200 may be accomplished with one or more additional operations not described, and/or without one or more of the operations discussed. Additionally, the order of the operations of the process 1200 as illustrated in FIG. 12 and described below is not intended to be limiting.
  • the processing engine 112 may determine at least one first classification probability vector with respect to a second object (e.g., a second object Ai) .
  • the at least one first classification probability vector may be determined based on at least one classification prediction model and the characteristic data associated with the second object Ai.
  • the at least one classification prediction model may be selected from the n classification prediction models that are determined in 913 described in FIG. 9.
  • the number of the at least one first classification probability vector (or the at least one classification prediction model) may be an integer larger than 0.
  • the number of the at least one first classification probability vector (or the at least one classification prediction model) may be less than or equal to n.
  • the number of the at least one first classification probability vector and the number of the at least one classification prediction model may be the same.
  • the characteristic data associated with the second object Ai may be inputted into the at least one classification prediction model, and the at least one corresponding first classification probability vector may be determined.
  • the processing engine 112 may determine at least one second classification probability vector with respect to the second object (e.g., the second object Ai) .
  • the at least one second classification probability vector may be determined based on at least one updated label transformation matrix and the predicted label of the second object Ai.
  • the at least one updated label transformation matrix may be selected from the n updated label transformation matrixes that are determined in 913 described in FIG. 9.
  • the number of the at least one updated label transformation matrix (or the at least one second classification probability vector) may be an integer larger than 0.
  • the number of the at least one updated label transformation matrix (or the at least one second classification probability vector) may be less than or equal to n.
  • the number of the at least one updated label transformation matrix and the number of the at least one second classification probability vector may be the same. In some embodiments, the number of the at least one first classification probability vector and the number of the at least one second classification probability vector may be the same or different. In some embodiments, the at least one updated label transformation matrix may correspond to the at least one classification prediction model one by one. In some embodiments, the predicted label of the second object Ai may be inputted into the at least one updated label transformation matrix, and the at least one corresponding second classification probability vector may be determined. In some embodiments, a second classification probability vector and a first classification probability vector may have the same size (i.e., the same number of elements) .
  • the processing engine 112 may determine a target classification probability vector.
  • the target classification probability vector may be determined based on the at least one first classification probability vector determined in 1201 and the at least one second classification probability vector determined in 1203.
  • the processing engine 112 may determine a target classification probability vector by averaging the at least one first classification probability vector and the at least one second classification probability vector. That is, one or more mean values with respect to the at least one first classification probability vector and the at least one second classification probability vector may be designated as one or more elements of the target classification probability vector.
  • a mean value may be determined by averaging the elements that are located in a same position of the at least one first classification probability vector and the at least one second classification probability vector.
  • two first classification probability vectors may be [0.2, 0.2, 0.6, 0] and [0, 0.6, 0.4, 0] .
  • Two second classification probability vectors may be [0.4, 0, 0.6, 0] and [0.4, 0.2, 0, 0.4] .
  • the target classification probability vector may be [0.25, 0.25, 0.4, 0.1] by averaging the two first classification probability vectors and the two second classification probability vectors.
  • the processing engine 112 may determine the target classification probability vector based on a weighted sum of the at least one first classification probability vector and the at least one second classification probability vector. That is, a weighted sum of the at least one first classification probability vector and the at least one second classification probability vector may be designated as the target classification probability vector.
  • the processing engine 112 may determine a classification for the second object based on the target classification probability vector.
  • the classification determination unit 720 may designate a label relating to a maximal value of the target classification probability vector as the classification for the second object (e.g., the second object Ai) .
  • the target classification probability vector is [0.25, 0.25, 0.4, 0.1]
  • the maximal value of the target classification probability vector may be 0.4.
  • the classification for the second object Ai may be a classification corresponding to a label relating to a third element of the target classification probability vector [0.25, 0.25, 0.4, 0.1] .
  • each of the first set of objects may be a user, and the classifications of the users may include four age groups (e.g., 10-20 years old, 20-30 years old, 30-40 years old, and 40-50 years old) .
  • the classification for the second object Ai may be 30-40 years old based on the target classification probability vector [0.25, 0.25, 0.4, 0.1] .
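  • A minimal sketch of the combination and selection described above is given below; the helper name and the optional weighting scheme are assumptions.

```python
import numpy as np

def predict_classification(first_vectors, second_vectors, weights=None):
    """Average (or weight) the first and second classification probability vectors into
    a target classification probability vector and return the index of its maximal value."""
    vectors = np.array(list(first_vectors) + list(second_vectors), dtype=float)
    if weights is None:
        target = vectors.mean(axis=0)                        # element-wise average
    else:
        w = np.asarray(weights, dtype=float)
        target = (w[:, None] * vectors).sum(axis=0) / w.sum()
    return target, int(np.argmax(target))

# Reproducing the example above:
# predict_classification([[0.2, 0.2, 0.6, 0], [0, 0.6, 0.4, 0]],
#                        [[0.4, 0, 0.6, 0], [0.4, 0.2, 0, 0.4]])
# returns the target vector [0.25, 0.25, 0.4, 0.1] and index 2 (the third label).
```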
  • the processing engine 112 described above is provided for the purposes of illustration, and not intended to limit the scope of the present disclosure. Especially, for persons having ordinary skills in the art, numerous variations and modifications may be conducted under the teaching of the present disclosure. However, those variations and modifications do not depart from the protection scope of the present disclosure.
  • the maximal value of the target classification probability vector may correspond to two or more labels.
  • the target classification probability vector may be [0.4, 0.4, 0.2, 0] .
  • the processing engine 112 may designate a first label relating to the maximal value of the target classification probability vector as the classification for the second object Ai.
  • the second object Ai may be classified as 10-20 years old or 20-30 years old based on the target classification probability vector [0.4, 0.4, 0.2, 0] . Similar modifications should fall within the scope of the present disclosure.
  • aspects of the present disclosure may be illustrated and described herein in any of a number of patentable classes or contexts including any new and useful process, machine, manufacture, or composition of matter, or any new and useful improvement thereof. Accordingly, aspects of the present disclosure may be implemented entirely in hardware, entirely in software (including firmware, resident software, micro-code, etc. ) or in an implementation combining software and hardware that may all generally be referred to herein as a “module, ” “unit, ” “component, ” “device” or “system. ” Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
  • a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including electro-magnetic, optical, or the like, or any suitable combination thereof.
  • a computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that may communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable signal medium may be transmitted using any appropriate medium, including wireless, wireline, optical fiber cable, RF, or the like, or any suitable combination of the foregoing.
  • Computer program code for carrying out operations for aspects of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Scala, Smalltalk, Eiffel, JADE, Emerald, C++, C#, VB. NET, Python or the like, conventional procedural programming languages, such as the "C" programming language, Visual Basic, Fortran 2003, Perl, COBOL 2002, PHP, ABAP, dynamic programming languages such as Python, Ruby and Groovy, or other programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN) , or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider) or in a cloud computing environment or offered as a service such as a Software as a Service (SaaS) .

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A system and method for predicting a classification for an object are provided. Data relating to a first set of objects may be obtained. The first set of objects may include a plurality of first objects (1010, 1030) and a plurality of second objects (1020, 1040). A predicted label may be determined for each of the first set of objects. An initial label transformation matrix (1065) with respect to the first set of objects may be determined. One or more subsets (1050) of the second objects (1020, 1040) may be obtained. One or more combined subsets (1060) of objects may be generated. A classification prediction model (1070) and an updated label transformation matrix (1080) associated with each of the one or more combined subsets (1060) of objects may be determined. A classification for at least one of the plurality of second objects (1020, 1040) may be predicted.
PCT/CN2018/079348 2017-03-23 2018-03-16 Système et procédé de prédiction de classification pour un objet WO2018171531A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201880020197.1A CN110447039A (zh) 2017-03-23 2018-03-16 预测对象类别的系统和方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710179031.1 2017-03-23
CN201710179031.1A CN108629358B (zh) 2017-03-23 2017-03-23 对象类别的预测方法及装置

Publications (1)

Publication Number Publication Date
WO2018171531A1 true WO2018171531A1 (fr) 2018-09-27

Family

ID=63585880

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079348 WO2018171531A1 (fr) 2017-03-23 2018-03-16 Système et procédé de prédiction de classification pour un objet

Country Status (2)

Country Link
CN (2) CN108629358B (fr)
WO (1) WO2018171531A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132178A (zh) * 2020-08-19 2020-12-25 深圳云天励飞技术股份有限公司 对象分类方法、装置、电子设备及存储介质

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111611429B (zh) * 2019-02-25 2023-05-12 北京嘀嘀无限科技发展有限公司 数据标注方法、装置、电子设备及计算机可读存储介质
CN110060247B (zh) * 2019-04-18 2022-11-25 深圳市深视创新科技有限公司 应对样本标注错误的鲁棒深度神经网络学习方法
US11645693B1 (en) * 2020-02-28 2023-05-09 Amazon Technologies, Inc. Complementary consumer item selection
US11526700B2 (en) 2020-06-29 2022-12-13 International Business Machines Corporation Annotating unlabeled data using classifier error rates

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009045461A1 (fr) * 2007-10-03 2009-04-09 Siemens Medical Solutions Usa, Inc. Système et procédé de classification mixte faisant appel à des vignettes de groupe spatial des caractéristiques
CN103605990A (zh) * 2013-10-23 2014-02-26 江苏大学 基于图聚类标签传播的集成多分类器融合分类方法和系统
CN104750875A (zh) * 2015-04-23 2015-07-01 苏州大学 一种机器错误数据分类方法及系统
CN105930411A (zh) * 2016-04-18 2016-09-07 苏州大学 一种分类器训练方法、分类器和情感分类系统
CN106452809A (zh) * 2015-08-04 2017-02-22 北京奇虎科技有限公司 一种数据处理方法和装置

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8386574B2 (en) * 2009-10-29 2013-02-26 Xerox Corporation Multi-modality classification for one-class classification in social networks
CN104572733B (zh) * 2013-10-22 2019-03-15 腾讯科技(深圳)有限公司 用户兴趣标签分类的方法及装置
CN103714139B (zh) * 2013-12-20 2017-02-08 华南理工大学 一种移动海量客户群识别的并行数据挖掘方法
CN105446988B (zh) * 2014-06-30 2018-10-30 华为技术有限公司 预测类别的方法和装置
US10504035B2 (en) * 2015-06-23 2019-12-10 Microsoft Technology Licensing, Llc Reasoning classification based on feature pertubation
CN104915436A (zh) * 2015-06-24 2015-09-16 合肥工业大学 自适应多标签预测方法
CN105184326A (zh) * 2015-09-30 2015-12-23 广东工业大学 基于图数据的主动学习多标签社交网络数据分析方法
CN105608471B (zh) * 2015-12-28 2020-01-14 苏州大学 一种鲁棒直推式标签估计及数据分类方法和系统
CN106446191B (zh) * 2016-09-30 2019-11-05 浙江工业大学 一种基于Logistic回归的多特征网络流行标签预测方法
CN106504029A (zh) * 2016-11-08 2017-03-15 山东大学 一种基于客户群体行为分析的加油站销量预测方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2009045461A1 (fr) * 2007-10-03 2009-04-09 Siemens Medical Solutions Usa, Inc. Système et procédé de classification mixte faisant appel à des vignettes de groupe spatial des caractéristiques
CN103605990A (zh) * 2013-10-23 2014-02-26 江苏大学 基于图聚类标签传播的集成多分类器融合分类方法和系统
CN104750875A (zh) * 2015-04-23 2015-07-01 苏州大学 一种机器错误数据分类方法及系统
CN106452809A (zh) * 2015-08-04 2017-02-22 北京奇虎科技有限公司 一种数据处理方法和装置
CN105930411A (zh) * 2016-04-18 2016-09-07 苏州大学 一种分类器训练方法、分类器和情感分类系统

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112132178A (zh) * 2020-08-19 2020-12-25 深圳云天励飞技术股份有限公司 对象分类方法、装置、电子设备及存储介质
CN112132178B (zh) * 2020-08-19 2023-10-13 深圳云天励飞技术股份有限公司 对象分类方法、装置、电子设备及存储介质

Also Published As

Publication number Publication date
CN110447039A (zh) 2019-11-12
CN108629358B (zh) 2020-12-25
CN108629358A (zh) 2018-10-09

Similar Documents

Publication Publication Date Title
US10254119B2 (en) Systems and methods for recommending an estimated time of arrival
US10883842B2 (en) Systems and methods for route searching
US20200011692A1 (en) Systems and methods for recommending an estimated time of arrival
US10922778B2 (en) Systems and methods for determining an estimated time of arrival
EP3479306B1 (fr) Procédé et système pour estimer une heure d'arrivée
US20200134648A1 (en) Methods and systems for preventing user churn
WO2018171531A1 (fr) Système et procédé de prédiction de classification pour un objet
CN109478275B (zh) 分配服务请求的系统和方法
WO2018227389A1 (fr) Systèmes et procédés de détermination d'heure d'arrivée estimée
US11546729B2 (en) System and method for destination predicting
TWI675184B (zh) 用於路線規劃的系統、方法及非暫時性電腦可讀取媒體
CN111316308B (zh) 用于识别错误订单请求的系统及方法
US20200234391A1 (en) Systems and methods for online to offline service
WO2019015661A1 (fr) Systèmes et procédés d'allocation de demandes de service
WO2018223331A1 (fr) Systèmes et procédés de détermination d'attribut de texte à l'aide d'un modèle de champ aléatoire conditionnel
CN111274472A (zh) 信息推荐方法、装置、服务器及可读存储介质
WO2021056127A1 (fr) Systèmes et procédés d'analyse de sentiments
WO2018184395A1 (fr) Systèmes et procédés de recommandation d'activité
US11120091B2 (en) Systems and methods for on-demand services
US20210064669A1 (en) Systems and methods for determining correlative points of interest associated with an address query
WO2022087767A1 (fr) Systèmes et procédés de recommandation d'emplacements de prélèvement

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18771848

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18771848

Country of ref document: EP

Kind code of ref document: A1