WO2022271431A1 - System and method that rank businesses in environmental, social and governance (ESG) - Google Patents

System and method that rank businesses in environmental, social and governance (ESG)

Publication number
WO2022271431A1
Authority
WO
WIPO (PCT)
Prior art keywords
esg
data
sentence
website
yielding
Prior art date
Application number
PCT/US2022/032134
Other languages
English (en)
Inventor
Jingtao Jonathan YAN
Alla KRAMSKAIA
Rochelle March
Original Assignee
The Dun And Bradstreet Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Dun And Bradstreet Corporation
Publication of WO2022271431A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0637 Strategic management or analysis, e.g. setting a goal or target of an organisation; Planning actions based on goals; Analysis or evaluation of effectiveness of goals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations
    • G06Q10/06393 Score-carding, benchmarking or key performance indicator [KPI] analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/018 Certifying business or products

Definitions

  • the present disclosure relates to environmental, social and governance metrics (ESG), and more particularly, to a technique for developing an ESG rankings dataset and generating an ESG score for a business.
  • ESG data tends to capture extra-financial factors that were traditionally absent in financial analysis, such as company management of energy and water use, waste generation, employee rights and working conditions, community engagement, data privacy rights, and more traditional indicators of corporate accountability and transparency. While ESG is traditionally not seen as material to business outcomes, evidence increasingly shows that there is a strengthening financial relationship to it.
  • Alpha is a measurement of the performance of a stock in relation to the overall market. The exact relationship is inconclusive, but ESG has become a popular strategy for identifying additional alpha and managing market volatility. For example, in April 2020, at the start of the COVID-19 recession, multiple ESG funds experienced smaller downfalls than those of common benchmarks such as the S&P 500®. In a world that has changed considerably since the profit-prioritizing Industrial Revolution, it is fitting that a new genre of company analysis via ESG factors can guide us.
  • ESG data has evolved considerably since the early days of socially responsible investing, when negative screenings eliminated investment in controversial sectors such as tobacco, alcohol, gambling, and weapons.
  • ESG scores on companies are primarily derived from company disclosure, whether from annual reports, ESG reports (also labeled as sustainability, corporate social responsibility, or impact reports), and financial filings. Because of this, updating of ESG data is limited to yearly cycles as new reports are published and this data is collected. While company disclosure has increased, it remains non-standardized and even rare for ESG data, and providers may use varying factors for calculating the same ESG topics (e.g., workplace health and safety). Several ESG factors, particularly for environmental impacts, are often modeled using generic segmentation such as sector, size, and location of a company, given limited and varied disclosure. In addition, data collection is often inclusive of only public companies, given the reliance on obtaining ESG data from reporting.
  • Because of non-standardization of company disclosure, as well as the collection of additional data from sources such as news and the media, ESG data providers often require a manual review of the data by an analyst. This has benefits in terms of capturing nuances around ESG disclosure, and it is the preferred approach for providing ESG in a traditional or associated rating, such as for providers like S&P Global and Moody’s.
  • manual evaluation of companies can also introduce bias that can result in inconsistencies and issues regarding company comparability. Manual analysis is also resource-intensive.
  • ESG data covers a broad spectrum of issues
  • emerging data collection methods including geospatial data from satellites, sensor data from the use of the industrial internet of things and the internet of things, and the application of advanced AI and ML analytics to additional datasets, will likely uncover additional and potentially more accurate modes of measuring ESG-related metrics.
  • data can be standardized through a process of normalization to allow comparison and aggregation of different metrics containing differing units. For example, 1,000 tons of carbon dioxide equivalent (tCO2e) can be converted to a number between 0 and 100 depending on the included maximum and minimum values in the sample, which may be the entire universe of companies or only companies in the same industry. Metrics can be aggregated to more general themes, such as environmental performance, which can be rolled up again into an overall ESG score.
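  • As a minimal sketch of the normalization described above, the following rescales a raw metric to the 0 to 100 range using the minimum and maximum of a sample. The function name, the midpoint convention for a degenerate sample, and the emissions figures are illustrative assumptions, not values from the disclosure.

```python
def min_max_scale(value, sample, lower=0.0, upper=100.0):
    """Rescale `value` into [lower, upper] using the min and max of `sample`."""
    lo, hi = min(sample), max(sample)
    if hi == lo:
        # Degenerate sample (all values equal): return the midpoint (assumed convention).
        return (lower + upper) / 2
    return lower + (value - lo) / (hi - lo) * (upper - lower)


# Hypothetical emissions in tCO2e for four companies in the same industry.
industry_emissions = [200.0, 1000.0, 1800.0, 4200.0]
scaled = [min_max_scale(v, industry_emissions) for v in industry_emissions]
```

  • Passing the entire company universe instead of a single industry as `sample` changes only the minimum and maximum used, which is the choice the paragraph above describes.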
  • topic-specific weighting can be applied based on the importance, or materiality, of that topic to the company’s sector.
  • the Sustainability Accounting Standards Board (SASB) Materiality Map™ provides a matrix that illustrates which ESG topics are considered financially material to distinct sectors. Weighting of topics can also vary depending on preference, such as weighting diversity more heavily because it is considered of greater importance to specific stakeholders. This latter approach is more common in impact metrics and investing, which is focused more on longer-term outcomes that may yield a smaller financial performance than traditional benchmarks until later years.
  • the present document discloses a method that includes (a) receiving data indicative of an environmental (E), social (S) and governance (G) objective, and measurements of ESG components, (b) creating a set of N-grams for each ESG component, (c) searching a database, based on the set of N-grams, to obtain ESG data, and (d) generating an ESG score based on the ESG data.
  • a system that performs the method.
  • FIG. 1 is a block diagram of a system for generating an ESG ranking.
  • FIG. 2 is a conceptual diagram of an ESG ranking method.
  • FIG. 3 is a conceptual block diagram of a method for big data collection and generation.
  • FIG. 4 is a flowchart for a method of web-scraping and NLP analysis.
  • FIG. 5 is a flowchart of a method of news NLP analysis.
  • FIG. 6 is a flowchart of a method for NLP and topic tagging.
  • FIG. 7 is a flowchart of a method for sentiment analysis.
  • FIG. 8 is a table of ESG rankings dataset’s topic architecture.
  • FIG. 9 is a flowchart of a high-level methodology for ESG ranking.
  • FIG. 10 is a table of example data for ESG topics of supplier engagement and environmental opportunities.
  • FIG. 11 is a table of illustrative scores for ESG themes across various data sources.
  • FIG. 12 is a table of overall ESG scores across sources.
  • FIG. 13 is a table of overall ESG factor scores that fall between thresholds that then inform the final ESG rankings.
  • FIG. 14 is a table of the keywords related to topics.
  • FIG. 15 is a table of examples of some predictors used in ESG.
  • FIG. 16 is a table of an example of execution of methods of NLP and topic and theme tagging, and sentiment analysis.
  • FIG. 17 is a table of topic weights related to a sector for gas utilities and distributors.
  • FIG. 18 is a table of an example calculation of the score for the Natural resources theme.
  • FIG. 19 is a table of an example calculation of an environment score.
  • FIG. 20 is a table of an exemplary calculation of an ESG score.
  • The ESG rankings dataset will contribute to the ESG data landscape by providing the following: (A) Wide coverage of both public and private companies based on a consistent approach. Today, there is a paucity of data on private companies, as these companies are not required to submit annual reports and filings on their performance. Where there is ESG data on private companies, it has often been collected using methods that differ considerably from those of public companies. Through multiple venues, Dun & Bradstreet reports on more than 420 million public and private companies on data related to their performance and trade. This data includes many topics that are important to ESG performance and offers existing channels for additional information related to environmental and social topics. This enables wide coverage and a consistent approach for compiling the ESG rankings dataset for companies.
  • the business landscape is rapidly changing, and so should the data that describes its impact on environmental and social factors.
  • Because ESG data is so often reliant on publicly available reports and filings that might be refreshed on an annual basis at most, it is often limited in its update frequency.
  • While the ESG rankings dataset also ingests this type of data, much of its private data is gathered throughout the year on a rolling basis, is updated consistently, and can be processed quickly in order to be available to customers. For example, for the ESG rankings dataset, data is processed weekly, and updates are available monthly.
  • the ESG rankings dataset will provide decision-useful metrics across a wide range of companies. Below, there is provided more detail on the methods used to create the ESG rankings dataset.
  • an ESG rankings dataset will preferably contribute to the ESG data landscape as follows:
  • the ESG rankings dataset’s topic architecture was created by referencing several of the leading ESG standards. Data is sourced, collected, and quality-checked through various processes. In preparation for analytical modeling and calculations, the data is further normalized, processed, and weighted. The outputs are various ESG-related rankings as well as overall scores. The ESG outputs are calculated to create data that is normally distributed between 1, indicating low risk or best performance, and 5, indicating high risk or worst performance.
  • the ESG rankings dataset offers a decision-useful set of metrics that can be used in multiple applications, such as supply chain management, investing, lending and credit evaluation, insurance inputs, and even sales and marketing segmentation. Aggregating a massive array of ESG-related data into manageable indicators that are decision-useful has been one of the long-term goals of the sustainability field.
  • An existing ESG rankings dataset was tested for robustness, and the testers recognized areas for refinement. These areas include (a) the focus of existing workstreams that increase data availability through more granular and broad data acquisition as well as further use of modeling, where appropriate, (b) refinement of NLP libraries and analysis to filter out “greenwashing”, and (c) harmonizing of local ESG data availability in an ESG dataset with global coverage. Developing ESG products that provide depth around specific risks or trends, such as climate impact or emerging regulations, is also part of providing a wide range of useful and valuable intelligence on the ESG metrics for public and private companies.
  • FIG. 1 is a block diagram of a system, namely system 100, for generating an ESG ranking.
  • System 100 includes a computer 105 coupled to a network 145 and a storage system 125.
  • Network 145 is a data communications network.
  • Network 145 may be a private network or a public network, and may include any or all of (a) a personal area network, e.g., covering a room, (b) a local area network, e.g., covering a building, (c) a campus area network, e.g., covering a campus, (d) a metropolitan area network, e.g., covering a city, (e) a wide area network, e.g., covering an area that links across metropolitan, regional, or national boundaries, (f) the Internet, or (g) a telephone network. Communications are conducted via network 145 by way of electronic signals and optical signals that propagate through a wire or optical fiber, or are transmitted and received wirelessly.
  • Computer 105 includes a processor 110, and a memory 115 that is operationally coupled to processor 110. Although computer 105 is represented herein as a standalone device, it is not limited to such, but instead can be coupled to other devices (not shown) in a distributed processing system.
  • Processor 110 is an electronic device configured of logic circuitry that responds to and executes instructions.
  • Memory 115 is a tangible, non-transitory, computer-readable storage device encoded with a computer program.
  • memory 115 stores data and instructions, i.e., program code, that are readable and executable by processor 110 for controlling operations of processor 110.
  • Memory 115 may be implemented in a random access memory (RAM), a hard drive, a read only memory (ROM), or a combination thereof.
  • One of the components of memory 115 is a program module 120.
  • Program module 120 contains instructions for controlling processor 110 to execute processes described herein.
  • The term “module” is used herein to denote a functional operation that may be embodied either as a stand-alone component or as an integrated configuration of a plurality of subordinate components.
  • Thus, program module 120 may be implemented as a single module or as a plurality of modules that operate in cooperation with one another.
  • Although program module 120 is described herein as being installed in memory 115, and therefore being implemented in software, it could be implemented in any of hardware (e.g., electronic circuitry), firmware, software, or a combination thereof.
  • Storage device 150 is a tangible, non-transitory, computer-readable storage device that stores program module 120 thereon. Examples of storage device 150 include (a) a read only memory, (b) an optical storage medium, (c) a hard drive, (d) a memory unit consisting of multiple parallel hard drives, (e) a universal serial bus (USB) flash drive, (f) a RAM, and (g) an electronic storage device coupled to computer 105 via network 145.
  • Storage system 125 is a storage device, for example, a hard drive or a database system, on which processor 110 stores data.
  • a user 135 uses a user device 130 that is communicatively coupled to network 145.
  • User device 130 includes a user interface 140.
  • User interface 140 includes an input device, such as a keyboard, speech recognition subsystem, or gesture recognition subsystem, for enabling user 135 to communicate information to and from computer 105 via network 145.
  • User interface 140 also includes an output device such as a display or a speech synthesizer and a speaker.
  • a cursor control or a touch-sensitive screen allows user 135 to utilize user interface 140 for communicating additional information and command selections to computer 105.
  • FIG. 2 is a conceptual diagram of an ESG ranking method, namely method 200, performed by system 100 on a cloud network.
  • user 135 communicates with computer 105, and more specifically processor 110, via user interface 140, and defines an objective (ESG) and measurements of its components (ESG pillars).
  • processor 110 creates a set of N-grams for each component.
  • An N-gram is a phrase having a quantity of N words. For example, “my black cat” is a 3-gram.
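  • To illustrate the definition, the short sketch below lists every run of N consecutive words in a phrase. Splitting on whitespace and the function name `ngrams` are illustrative assumptions; the disclosure does not specify how N-grams are produced.

```python
def ngrams(text, n):
    """Return every run of `n` consecutive whitespace-separated words in `text`."""
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


trigrams = ngrams("my black cat", 3)  # -> ["my black cat"]
bigrams = ngrams("my black cat", 2)   # -> ["my black", "black cat"]
```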
  • processor 110 performs big data collection and generation (see FIG. 3).
  • processor 110 creates component weights for each business segment through machine learning, benchmarked against literature and sustainability standards that are based on the importance, or materiality, of ESG components to the business segment.
  • processor 110 scores a business.
  • the data collected in operation 215 and the weights created in operation 220 are used together for scoring in operation 225. Processor 110 obtains missing values from the family tree of the business (immediate parent, same industry). Override rules are utilized for blacklists and award lists.
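  • One way to read the family-tree rule is sketched below: a business with no value inherits the value of its immediate parent when both are in the same industry. This reading, the function name, and the data layout are assumptions made for illustration; the disclosure does not spell out the rule, and the override rules are not modeled here.

```python
def fill_missing_from_parent(scores, parent_of, industry_of):
    """Fill missing (None) scores from the immediate parent in the same industry.

    scores:      company -> score, or None when the value is missing
    parent_of:   company -> immediate parent company (hypothetical family tree)
    industry_of: company -> industry label
    """
    filled = dict(scores)
    for company, score in scores.items():
        if score is not None:
            continue
        parent = parent_of.get(company)
        if parent is None:
            continue
        industry = industry_of.get(company)
        same_industry = industry is not None and industry == industry_of.get(parent)
        if same_industry and scores.get(parent) is not None:
            filled[company] = scores[parent]
    return filled


# Hypothetical family tree: two subsidiaries, only one in the parent's industry.
scores = {"parent_co": 0.4, "sub_a": None, "sub_b": None}
parent_of = {"sub_a": "parent_co", "sub_b": "parent_co"}
industry_of = {"parent_co": "utilities", "sub_a": "utilities", "sub_b": "retail"}
filled = fill_missing_from_parent(scores, parent_of, industry_of)
```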
  • ESG ranking data is stored in storage system 125.
  • FIG. 3 is a conceptual block diagram of big data collection and generation, as performed by operation 215.
  • Operation 215 receives data from data sources 305, which include data sources 310, 315, 335 and 340.
  • Data sources 310 include the world’s leading commercial data company’s clouds and third-party data sources. Examples include Green List, Global Diversity List, spend data, inquiry data, Global Archive, comprehensive global database of business information, small business risk insights, CountryRisk, Risk scores (SSI/SER), and GHG Emission.
  • Data sources 315 are public data sources, which may include data in various formats, e.g., pictures and PDF.
  • Data sources 315 include (a) public data 320, (b) company websites 325, and (c) company reports 330.
  • Public data 320 includes data from government, e.g., SEC, and United Nations sources, and includes Form 10-K, proxy statements, annual reports, EPA, OSHA, EPLS and OFAC.
  • Company websites 325 includes text contained in ESG-related URLs under company domains, and CSR reports.
  • Data sources 335 are internet-based data sources and NGOs.
  • Data sources 340 are global news data sources, such as global news feeds from premier global news providers.
  • Operation 215 includes several subordinate operations, namely operations 350, 355, 360 and 365.
  • processor 110 receives data from data sources 310, and processes the world’s leading commercial data company data cloud, factual and derived data, and third-party ESG data.
  • processor 110 receives data from data sources 315 and 335, and performs web-scraping and NLP analysis (see FIG. 4). For example, for company reports 330, processor 110 performs text NLP and image recognition on board member gender.
  • processor 110 receives data from data sources 340, and performs news NLP analysis (see FIG. 5).
  • processor 110 performs quality assurance on results of operations 350, 355 and 360.
  • Many data are missing or not available for generation of an ESG index. Such data can be derived through machine learning. Examples of such data include CO2e GHG emission predictions, electricity predictions, and climate perils impacts on business performance.
  • FIG. 4 is a flowchart for a method of web-scraping and NLP analysis, as performed in operation 355.
  • processor 110 performs domain mapping for a numeric identifier of a business entity.
  • processor 110 performs web scraping, which includes:
  • the web scraping is performed on data of various formats.
  • processor 110 performs natural language processing and topic and theme tagging (see FIG. 6), which includes:
  • processor 110 performs sentiment analysis (see FIG. 7). Sentiment analysis analyzes text to understand the opinion it expresses. Typically, this sentiment is quantified with a positive, negative, or neutral value.
  • processor 110 performs ESG scoring based on processed web data.
  • Topic score is calculated based on average of processed ESG data values.
  • Theme score is calculated by weighted average of corresponding topic scores.
  • ESG score of web data is then calculated by weighted average of all available topic scores.
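  • The roll-up just described can be sketched as follows: a topic score is the plain average of its processed values, and a theme score or the web-data ESG score is a weighted average of topic scores. The topic names, values, and weights are hypothetical, and returning `None` when no weight applies is an assumed convention.

```python
from statistics import mean


def topic_score(values):
    """Topic score: plain average of the processed ESG data values for the topic."""
    return mean(values)


def weighted_average(scores, weights):
    """Weighted average over the keys present in `scores`; None if total weight is 0."""
    total = sum(weights[k] for k in scores)
    if total == 0:
        return None
    return sum(scores[k] * weights[k] for k in scores) / total


# Hypothetical processed values (on a -1 to 1 scale) for two topics of one theme.
processed = {
    "Energy management": [0.5, 0.1],
    "Water management": [-0.2, -0.4],
}
topic_weights = {"Energy management": 3.0, "Water management": 1.0}

topic_scores = {topic: topic_score(vals) for topic, vals in processed.items()}
theme_score = weighted_average(topic_scores, topic_weights)
# With several themes, the web-data ESG score applies the same weighted average
# over all available topic scores.
```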
  • FIG. 5 is a flowchart of news NLP analysis, as performed in operation 360.
  • processor 110 performs news extraction.
  • News extraction involves collecting news data pertaining to companies globally, received from a premier news data provider via a file transfer protocol server.
  • processor 110 performs news mapping to the numeric identifier of a business entity, thereby identifying the company to which the received news corresponds.
  • processor 110 performs NLP and topic and theme tagging (see FIG. 6), which includes:
  • processor 110 performs sentiment analysis (see FIG. 7).
  • processor 110 performs ESG scoring based on processed news data.
  • Topic score is calculated based on average of processed ESG data values.
  • Theme score is calculated by weighted average of corresponding topic scores.
  • FIG. 6 is a flowchart of a method 600 for NLP and topic tagging (multi-language processing), as performed in operations 415 and 515.
  • processor 110 tokenizes text data into sentences, i.e., large text data received as paragraphs or documents is split into sentences.
  • processor 110 preprocesses sentences. Preprocessing involves cleaning of textual sentences by removal of special characters and other text cleaning operations.
  • processor 110 tags each sentence to E, S and G multigrams/keywords using a Python library for fast keyword searching: the N-grams obtained in operation 210 are searched within the sentences to classify them into E, S and G categories.
  • processor 110 tags each sentence to themes and topics under E, S and G dimensions based on detected E, S, G specific N-grams identified within each sentence in operation 615.
  • processor 110 shortlists sentences that have at least one mention of E, S or G, and moves the output to storage system 125.
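  • The steps of method 600 can be sketched end to end with the Python standard library. The regular-expression sentence splitter, the cleaning rules, and the N-gram lists below are illustrative assumptions; the disclosure refers to a fast keyword-search library without naming it, so plain `re` matching is swapped in here.

```python
import re

# Hypothetical N-grams per ESG dimension (stand-ins for the output of operation 210).
NGRAMS = {
    "E": ["greenhouse gas", "water use"],
    "S": ["working conditions"],
    "G": ["board independence"],
}


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def preprocess(sentence):
    """Lowercase, replace special characters with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s-]", " ", sentence.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def tag_sentence(sentence, ngrams=NGRAMS):
    """Return {dimension: [matched N-grams]} for a preprocessed sentence."""
    tags = {}
    for dimension, phrases in ngrams.items():
        hits = [p for p in phrases
                if re.search(r"\b" + re.escape(p) + r"\b", sentence)]
        if hits:
            tags[dimension] = hits
    return tags


def shortlist(text):
    """Keep only sentences that mention at least one E, S or G N-gram."""
    results = []
    for raw in split_sentences(text):
        clean = preprocess(raw)
        tags = tag_sentence(clean)
        if tags:
            results.append((clean, tags))
    return results


sample = ("We cut greenhouse gas emissions by 10%. Our office moved. "
          "Working conditions improved!")
tagged = shortlist(sample)
```

  • Mapping each matched N-gram onward to a theme and topic would need a lookup table of the kind shown in FIG. 14, which is not reproduced here.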
  • FIG. 7 is a flowchart of a method 700 for sentiment analysis, as performed in operations 420 and 520.
  • processor 110 loads preprocessed sentences from a cloud storage location into a web-based integrated development environment for sentiment analysis.
  • processor 110 utilizes one or more machine learning models, such as Bidirectional Encoder Representations from Transformers (BERT) and Zero Shot, to perform sentiment analysis on the shortlisted sentences.
  • Processor 110 also performs business identity resolution, which includes:
  • the ESG rankings dataset’s topic architecture was created by referencing several of the leading ESG standards, including the SASB, the Global Reporting Initiative (GRI), the Task Force on Climate-related Financial Disclosures (TCFD), the CDP (formerly the Carbon Disclosure Project), the UN SDGs, and other notable sustainability reporting frameworks. Under each of the environmental (E), social (S), and governance (G) dimensions, specific themes were described, as well as another layer of specific topics that relate to each general theme. Once this framework was established, each of the ESG themes could then be populated with hundreds of variables sourced from various datasets.
  • the ESG rankings dataset uses the SASB Sustainable Industry Classification System ® taxonomy for sector classifications.
  • this taxonomy categorizes companies into sectors and industries in accordance with a fundamental view of their business model, their resource intensity, their sustainability impacts, and their sustainability innovation potential.
  • This sector classification is superior to other such systems, such as the Global Industry Classification Standard, for improving ESG issue identification per sector segment.
  • FIG. 8 is a table of ESG rankings dataset’s topic architecture, and shows several exemplary themes and topics. In this example, there are 13 ESG themes.
  • variables are ingested and quality checked through various processes. In preparation for analytical modeling and calculations, data is further normalized, processed, and weighted. The output is various ESG-related rankings as well as an overall score.
  • FIG. 9 is a flowchart of a high-level methodology for ESG ranking.
  • Data Sourcing and Collection: Data is first sourced through internal Dun & Bradstreet databases using analytical tools. This data is complemented with data from government sources (e.g., U.S. Environmental Protection Agency (EPA) compliance and environmental pollutant data), public sources (e.g., company reports and filings), news (e.g., processed through D&B Hoovers), and some third-party licensed data (e.g., aggregation of sustainability reports, GHG emissions from CDP). Companies can also directly submit additional ESG-related data through Dun & Bradstreet channels that can then be integrated into the ESG rankings dataset. The following are examples of data sources for the ESG rankings dataset:
  • topic extraction is done via NLP and deep learning. Keywords are organized in an ontology specific to the ESG domain. This is created through deep learning models such as Latent Dirichlet Allocation topic modeling, Google’s pretrained word embeddings, word2vec, and evaluations from subject experts that inform testing.
  • An ESG-BERT model is employed to detect polarity among keywords after models are trained using manually labeled sentences containing those keywords. These phrases are collected, evaluated, and organized into distinct keywords, bigrams (two keywords in one phrase), trigrams (three keywords in one phrase), and so on, that are combined across sources and averaged. Calculated averages are then normalized between -1 and 1 and mapped to an associated ESG topic.
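  • A minimal sketch of the combining step, under stated assumptions: each labeled phrase occurrence is mapped to +1, 0 or -1, the values are averaged across sources per topic, and the result therefore stays between -1 and 1. The label-to-value mapping and the phrase-to-topic table are hypothetical, and the ESG-BERT model itself is not reproduced; its output labels are simply taken as input.

```python
from collections import defaultdict

# Assumed numeric values for polarity labels.
LABEL_VALUE = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}

# Hypothetical mapping from detected phrases to ESG topics.
PHRASE_TOPIC = {
    "hazardous waste": "Waste and hazards management",
    "habitat restoration": "Land use and biodiversity",
}


def topic_polarity(observations):
    """Average polarity per topic over (phrase, source, label) observations."""
    by_topic = defaultdict(list)
    for phrase, _source, label in observations:
        topic = PHRASE_TOPIC.get(phrase)
        if topic is not None:
            by_topic[topic].append(LABEL_VALUE[label])
    return {topic: sum(vals) / len(vals) for topic, vals in by_topic.items()}


observations = [
    ("hazardous waste", "web", "negative"),
    ("hazardous waste", "news", "neutral"),
    ("habitat restoration", "web", "positive"),
]
polarity = topic_polarity(observations)
```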
  • a word embedding is a numerical representation of text that captures its meaning, semantic relationships, and the different types of contexts in which it is used.
  • a pre-trained word embedding may be a deep learning model trained on billions of words from news articles that fits these words in a high-dimensional vector space.
  • ESG topic score determines the final ESG topic score. If an ESG topic is not considered material to that company’s sector as determined by Dun & Bradstreet’s financial analysis, then a weight of 0 (zero) is assigned. In order to calculate an ESG topic score, there must be enough data to inform the variables that cover the financially material ESG topics. ESG topic scores then inform a larger ESG theme score that informs the overall ESG ranking. There must be enough ESG-related data available to adequately populate several of the themes, for example, five of the 13 ESG themes in the table in FIG. 8. As more data is ingested and becomes available, it is likely more companies will be assigned an ESG ranking.
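  • The two rules in this paragraph, a zero weight for non-material topics and a minimum amount of populated themes, can be sketched as below. The threshold of five themes follows the example in the text; the function names, topic names, and weights are hypothetical.

```python
MIN_POPULATED_THEMES = 5  # follows the example above: five of the 13 ESG themes


def material_weights(raw_weights, material_topics):
    """Assign a weight of 0 to topics that are not material to the sector."""
    return {topic: (weight if topic in material_topics else 0.0)
            for topic, weight in raw_weights.items()}


def has_enough_coverage(theme_scores, min_themes=MIN_POPULATED_THEMES):
    """True when at least `min_themes` themes have a score (None means no data)."""
    populated = [t for t, score in theme_scores.items() if score is not None]
    return len(populated) >= min_themes


# Hypothetical sector view: only one of two topics is financially material.
weights = material_weights(
    {"Energy management": 0.5, "Data privacy": 0.5},
    material_topics={"Energy management"},
)

# Hypothetical theme scores: four populated themes, one without data.
themes = {"t1": 0.2, "t2": -0.1, "t3": 0.0, "t4": 0.7, "t5": None}
rankable = has_enough_coverage(themes)
```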
  • FIG. 10 is a table of example data for ESG topics of supplier engagement and environmental opportunities.
  • ESG topic scores are then aggregated using a weighted average on the theme level across the data sources to determine an overall ESG theme score.
  • FIG. 11 is a table of illustrative scores for ESG themes across various data sources.
  • ESG theme scores then roll up to the average ESG factor scores across the E, S, G, and overall ESG dimensions.
  • FIG. 12 is a table of overall ESG scores across sources.
  • the factor scores fall between distinct thresholds that then inform the final ESG rankings from 1 to 5, with 1 being the lowest risk company in the universe and 5 being the highest risk company.
  • FIG. 13 is a table of overall ESG factor scores that fall between thresholds that then inform the final ESG rankings.
  • the ESG outputs are calculated to compose a dataset that results in a normal distribution of data between 1, indicating low risk or best performance, and 5, indicating high risk or worst performance.
  • Cluster analysis on the company universe informs the number of thresholds (in this case 5), while thresholds are determined based on the standard deviation for the distribution of companies. This range is chosen in order to provide enough distinction between risk categories based on the available data that can conclusively express a risk factor on a reliable scale. For example, a company ranked 4 will have a significantly different risk profile than a company ranked 5, and even more so than a company ranked 1.
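  • As an illustration of thresholds set from the standard deviation, the sketch below converts factor scores to z-scores and bins them into five ranks. The cut points of ±0.5 and ±1.5 standard deviations are assumptions, as is the convention that a higher factor score means lower risk and therefore a rank closer to 1; the disclosure does not state the actual thresholds.

```python
from bisect import bisect_right
from statistics import mean, pstdev

# Hypothetical cut points, in standard deviations from the mean.
Z_CUTS = [-1.5, -0.5, 0.5, 1.5]


def rank_companies(scores):
    """Map company -> rank 1..5 (1 = lowest risk), assuming higher score = lower risk."""
    values = list(scores.values())
    mu, sigma = mean(values), pstdev(values)
    ranks = {}
    for company, score in scores.items():
        z = 0.0 if sigma == 0 else (score - mu) / sigma
        band = bisect_right(Z_CUTS, z)  # 0 (lowest scores) .. 4 (highest scores)
        ranks[company] = 5 - band
    return ranks


# Hypothetical factor scores for five companies.
ranks = rank_companies({"A": 0.9, "B": 0.1, "C": 0.0, "D": -0.1, "E": -0.9})
```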
  • The main relationship of ESG data to company risk is captured when data is topically organized and aggregated to an overall metric. ESG data is also not generally rich enough to allow non-transparent calculation methods, which can occur with ML. As the dataset grows in both coverage and depth, there may be opportunities to identify specific variables that can contribute to ESG-related algorithms that benefit from ML.
  • the ESG rankings dataset is a ranking model and will adjust as the overall market improves and changes its ESG-related activities. The more companies implement management of ESG issues, the harder it will be for companies to remain in the top class.
  • the model depicts placements based on observed behaviors and not a probability of a perceived change or exposure to risk, although historical observed behaviors can have a correlation to risk events that can result in financial, reputational or operational damages.
  • Future developments of ESG data and analytics include the development of risk models that capture perceived change or exposure to an event.
  • ESG Self-Assessment provides an additional channel for data collection and company validation of ESG data. Any collected information goes through additional verification processes, and once processed, is added to any existing ESG data on a company.
  • the ESG Self-Assessment may include an online questionnaire composed of questions regarding ESG performance.
  • the Self-Assessment references several of the main existing sustainability frameworks (e.g., the GRI, SASB, International Integrated Reporting Council, TCFD) as well as any current and emerging ESG-related regulatory frameworks (EU Taxonomy, SFDR, TCFD, etc.). It is complementary to the ESG rankings dataset and may streamline and prioritize specific ESG topics that are financially material to companies.
  • the ESG Self-Assessment is a mechanism for further data collection and company validation of data, but it also identifies the topics and areas where a company may want to focus its ESG strategies, especially as it moves through differing cycles of sustainability maturity. In conjunction with the ESG rankings, the ESG Self-Assessment helps a company identify current ESG-related gaps in its strategy, reveals areas of potential improvement, and can inform the creation of short- and long-term ESG targets and goals.
  • the coverage and materiality focus of the ESG Rankings allow for myriad applications, especially wherever risk identification needs to occur across a wide range and number of companies.
  • the ESG Rankings dataset can be useful, for example, for the following positions.
  • XYZ wants to access ABC’s ESG score as per the ESG components in FIG. 8.
  • E environmental
  • S social
  • G governance
  • N-Grams are keywords that are ontology-specific to the ESG components.
  • FIG. 14 is a table of the keywords related to topics, namely, (a) waste and hazards management, and (b) land use and biodiversity.
  • Operation 215 takes the N-grams from operation 210, and collects and generates big data (see FIG. 3).
  • Data is obtained from data sources 310, e.g., Dun & Bradstreet databases and 3rd party data.
  • FIG. 15 is a table of examples of some predictors used in ESG. Raw values of predictors are converted to a scale of -1 to 1 based on the impact of the predictor, where -1 represents the most risk or most negative impact and 1 represents the least risk or most positive impact.
  • Data sources 315, e.g., public data sources: company websites and 10-K/CSR/other ESG-related reports.
  • Data sources 335, e.g., data from highly reliable web sources that have rich ESG data pertaining to different companies.
  • Data sources 340, e.g., global news data related to companies.
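The conversion of raw predictor values to the -1 to 1 scale (FIG. 15) can be sketched as below. The source does not specify the conversion function. This sketch assumes a linear min-max mapping with clipping, and a hypothetical `higher_is_riskier` flag to express the direction of a predictor's impact.

```python
def scale_predictor(value, lo, hi, higher_is_riskier=False):
    """Linearly map a raw value from [lo, hi] to [-1, 1] (assumed mapping).

    If larger raw values mean more risk, the sign is flipped so that
    -1 still denotes the most risk and 1 the least risk.
    """
    if hi == lo:
        return 0.0  # degenerate range: treat as neutral
    clipped = min(max(value, lo), hi)
    scaled = 2.0 * (clipped - lo) / (hi - lo) - 1.0
    return -scaled if higher_is_riskier else scaled

# Hypothetical predictors: a recycling rate (good) and an incident count (bad).
recycling = scale_predictor(75, 0, 100)
incidents = scale_predictor(10, 0, 10, higher_is_riskier=True)
```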
  • Text data from data sources 315, 335 and 340 is collected and processed as follows.
  • operation 355 data from web domains related to data sources 315 and 335 are collected by first identifying the company domain, and then extracting the ESG-specific data present in the company’s website. (See FIG. 4, operations 405 and 410)
  • news data from data sources 340 is received from a premier news provider via a file transfer protocol server, and is then mapped to the numeric identifier of a business entity to identify the company to which the news relates. (See FIG. 5, operations 505 and 510.)
  • Method 600 performs NLP and topic and theme tagging (See FIG. 6). Text data collected as paragraphs/documents is split to sentence level, and then these sentences are preprocessed by removing special characters and other text cleaning operations.
  • Method 700 performs sentiment analysis (See FIG. 7).
  • the polarity/sentiment (positive, negative, neutral) of the preprocessed ESG sentences is determined using BERT/Zero shot models.
  • FIG. 16 is a table of an example of execution of methods 600 and 700, which shows the ESG theme and topic tagging of text data and the derivation of the ESG converted value based on polarity. Positive polarity is assigned a value of +1, negative polarity is assigned a value of -1, and neutral polarity is assigned a value of 0.
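The sentence splitting, text cleaning and polarity-to-value conversion described for methods 600 and 700 can be sketched as follows. The regular expressions are simplified assumptions rather than the cleaning rules of the source. The polarity labels would come from a BERT or zero-shot model, which is not shown here, so the function only converts a label that has already been produced.

```python
import re

# Converted values as stated in the text: positive +1, negative -1, neutral 0.
POLARITY_VALUE = {"positive": 1, "negative": -1, "neutral": 0}

def split_sentences(text):
    """Naive split after '.', '!' or '?' followed by whitespace (assumption)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def clean_sentence(sentence):
    """Lowercase, replace special characters with spaces, collapse whitespace."""
    kept = re.sub(r"[^a-z0-9\s-]", " ", sentence.lower())
    return re.sub(r"\s+", " ", kept).strip()

def polarity_to_value(label):
    """Convert a polarity label from a sentiment model into its ESG value."""
    return POLARITY_VALUE[label.lower()]

# Hypothetical paragraph of website text.
text = ("We cut landfill waste by 30%! A chemical spill was reported in May. "
        "The plant is in Ohio.")
sentences = [clean_sentence(s) for s in split_sentences(text)]
```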
  • Operation 220 creates component weights for each business segment through existing literature/standards. These topic-specific weights are based on the importance, or materiality, of that topic to the company’s sector. FIG. 17 is a table of topic weights related to a sector for gas utilities and distributors as per the literature/standards.
  • each ESG component score is calculated as follows.
  • Topic score is calculated based on average of processed data values. Some topic scores are also overridden based on Blacklists/certifications data.
  • Theme score is calculated by weighted average of corresponding topic scores. For instance, a score for a Natural resources theme is calculated.
  • FIG. 18 is a table of an example calculation of the score for the Natural resources theme.
  • Dimension score (environment/social/governance) is obtained by weighted average of corresponding topic scores.
  • FIG. 19 is a table of an example calculation of an environment score.
  • the ESG score of a data source is then calculated by weighted average of all available topic scores.
  • FIG. 20 is a table of an exemplary calculation of an ESG score.
  • the overall ESG score ranges on a scale of -1 to 1, where -1 represents the most risk or most negative impact, and 1 represents the least risk or most positive impact.
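The score roll-up described above (topic score as an average of processed values, then weighted averages of topic scores) can be sketched as below. The topic names `waste_and_hazards` and `land_use_and_biodiversity` follow FIG. 14. The third topic, all weights and all data values are hypothetical. Renormalizing the weights over the topics that actually have data is an assumption about what "all available topic scores" means, and the blacklist/certification overrides are not modeled.

```python
def topic_score(values):
    """Average the processed data values (each in [-1, 1]) for one topic."""
    return sum(values) / len(values)

def weighted_average(scores, weights):
    """Weighted average over the topics present in `scores`.

    Assumption: weights are renormalized over the available topics.
    """
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight

# Hypothetical processed values per topic; 'climate_risk' has no data.
processed_values = {
    "waste_and_hazards": [1, -1, 0, 1],
    "land_use_and_biodiversity": [1, 1],
}
# Hypothetical sector-specific topic weights (materiality).
topic_weights = {
    "waste_and_hazards": 0.3,
    "land_use_and_biodiversity": 0.2,
    "climate_risk": 0.5,
}

topic_scores = {name: topic_score(v) for name, v in processed_values.items()}
rolled_up_score = weighted_average(topic_scores, topic_weights)
```

The same `weighted_average` helper can be applied at the theme, dimension and overall ESG level by passing the subset of topic scores that belongs to that level.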
  • thresholds are derived and applied accordingly for each component to assign ESG rankings/scores.
  • ESG scores are given based on nearest hierarchy within that family tree.
  • ESG fields/results will be transferred to a platform from which a user, e.g., user 135, will be able to access the ESG scores.
  • processor 110 performs operations of (a) receiving data indicative of an environmental (E), social (S) and governance (G) objective, and measurements of ESG components, (b) creating a set of N-grams for each ESG component, (c) searching a database, based on the set of N-grams, to obtain ESG data, and (d) generating an ESG score based on the ESG data.
  • Generating the ESG score based on the ESG data may include creating a component weight for a business segment.
  • Creating a component weight may be performed by a machine learning component.
  • Generating the ESG score may include (a) obtaining website data from a website for a business based on the ESG data, (b) natural language processing (NLP) of the website data, thus yielding a tag, (c) performing a sentiment analysis on the tag, thus yielding a sentiment, and (d) utilizing the tag and the sentiment to generate the ESG score.
  • Obtaining website data may include domain mapping the business to the website, and web scraping the website to obtain the website data.
  • Obtaining website data may also include (a) obtaining news concerning the ESG data, and (b) mapping the business to the website based on the news.
  • NLP may include (a) tokenizing text data from the website into a sentence, (b) tagging the sentence to E, S and G multigrams, (c) tagging the sentence to a theme and topic under E, S and G dimensions based on the E, S and G multigrams, and (d) shortlisting the sentence in response to the sentence having at least one E, S or G mention, thus yielding a shortlisted sentence.
  • Sentiment analysis may include (a) analyzing the shortlisted sentence utilizing a machine learning model, thus yielding an analyzed sentence, (b) tagging a polarity of the analyzed sentence, thus yielding a polarity, (c) aggregating sentiment for the business for the theme and topic based on the polarity, thus yielding aggregated data, and (d) calculating an index based on the aggregated data.
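The tagging, shortlisting and aggregation steps listed above can be sketched end to end as follows. The multigram dictionary is hypothetical apart from the two topic names taken from FIG. 14, and the theme level is omitted for brevity. The polarity function stands in for the machine learning model. Using the mean polarity per topic as the index is an assumption, since the source does not define the index calculation.

```python
from collections import defaultdict

# Hypothetical ontology: multigram -> (dimension, topic).
ONTOLOGY = {
    "hazardous waste": ("E", "waste and hazards management"),
    "landfill": ("E", "waste and hazards management"),
    "biodiversity": ("E", "land use and biodiversity"),
}

def tag_sentence(sentence):
    """Return the (dimension, topic) tags whose multigrams occur in the sentence."""
    lowered = sentence.lower()
    return sorted({tag for gram, tag in ONTOLOGY.items() if gram in lowered})

def shortlist(sentences):
    """Keep only sentences with at least one E, S or G mention."""
    tagged = [(s, tag_sentence(s)) for s in sentences]
    return [(s, tags) for s, tags in tagged if tags]

def aggregate(shortlisted, polarity_of):
    """Average the polarity values per (dimension, topic) tag (assumed index)."""
    buckets = defaultdict(list)
    for sentence, tags in shortlisted:
        for tag in tags:
            buckets[tag].append(polarity_of(sentence))
    return {tag: sum(vals) / len(vals) for tag, vals in buckets.items()}

sentences = [
    "Hazardous waste volumes fell this year.",
    "The landfill leaked into a nearby river.",
    "Our office moved downtown.",
    "We fund biodiversity restoration.",
]
# Stand-in for the sentiment model: hand-assigned polarity values.
manual_polarity = {sentences[0]: 1, sentences[1]: -1, sentences[3]: 1}

shortlisted = shortlist(sentences)
index = aggregate(shortlisted, manual_polarity.__getitem__)
```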

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Theoretical Computer Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Physics & Mathematics (AREA)
  • Strategic Management (AREA)
  • General Physics & Mathematics (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method includes (a) receiving data indicative of an environmental (E), social (S) and governance (G) objective, and measurements of ESG components, (b) creating a set of N-grams for each ESG component, (c) searching a database, based on the set of N-grams, to obtain ESG data, and (d) generating an ESG score based on the ESG data. Also provided is a system that performs the method.
PCT/US2022/032134 2021-06-22 2022-06-03 System and method that rank businesses in environmental, social and governance (ESG) WO2022271431A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US202163213497P 2021-06-22 2021-06-22
US63/213,497 2021-06-22
US202163247647P 2021-09-23 2021-09-23
US63/247,647 2021-09-23
US202263309013P 2022-02-11 2022-02-11
US63/309,013 2022-02-11

Publications (1)

Publication Number Publication Date
WO2022271431A1 true WO2022271431A1 (fr) 2022-12-29

Family

ID=84544692

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2022/032134 WO2022271431A1 (fr) 2021-06-22 2022-06-03 System and method that rank businesses in environmental, social and governance (ESG)

Country Status (1)

Country Link
WO (1) WO2022271431A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370876A (zh) * 2023-10-24 2024-01-09 重庆设计集团有限公司城市建设策略研究院 ESG index evaluation system, method, and storage medium based on multi-source data fusion

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120316916A1 (en) * 2009-12-01 2012-12-13 Andrews Sarah L Methods and systems for generating corporate green score using social media sourced data and sentiment analysis
US20190333078A1 (en) * 2017-01-13 2019-10-31 TruValue Labs, Inc. Methods of assessing long-term indicators of sentiment
US20190362427A1 (en) * 2018-05-23 2019-11-28 Panagora Asset Management, Inc System and method for constructing optimized esg investment portfolios

Similar Documents

Publication Publication Date Title
US20220343433A1 (en) System and method that rank businesses in environmental, social and governance (esg)
US11257161B2 (en) Methods and systems for predicting market behavior based on news and sentiment analysis
US11941714B2 (en) Analysis of intellectual-property data in relation to products and services
US8458065B1 (en) System and methods for content-based financial database indexing, searching, analysis, and processing
US20240221098A1 (en) Analysis Of Intellectual-Property Data In Relation To Products And Services
US20120316916A1 (en) Methods and systems for generating corporate green score using social media sourced data and sentiment analysis
US20120296845A1 (en) Methods and systems for generating composite index using social media sourced data and sentiment analysis
US11803927B2 (en) Analysis of intellectual-property data in relation to products and services
US11348195B2 (en) Analysis of intellectual-property data in relation to products and services
US20210004918A1 (en) Analysis Of Intellectual-Property Data In Relation To Products And Services
Nissim Big data, accounting information, and valuation
US20240346531A1 (en) Systems and methods for business analytics model scoring and selection
CN114139539A (zh) Method, system, and application for quantifying corporate social responsibility indicators
Lu et al. The effects and applicability of financial media reports on corporate default ratings
Wu Text-based measure of supply chain risk exposure
CN114303140A (zh) Analysis of intellectual-property data in relation to products and services
Choi et al. Exploring the deep neural network model’s potential to estimate abnormal audit fees
Lo et al. Do polluting firms suffer long term? Can government use data‐driven inspection policies to catch polluters?
Hafeez et al. Looking beyond the financial numbers: The relationship between macroeconomic indicators and the likelihood of financial distress
Abaidoo et al. Environmental sustainability risk, institutional effectiveness and urbanization
WO2022271431A1 (fr) System and method that rank businesses in environmental, social and governance (ESG)
US20230394582A1 (en) User interface for guiding actions for desired impact
Kim et al. Do SEC filings indicate any trends? Evidence from the sentiment distribution of forms 10-K and 10-Q with FinBERT
CA3160715A1 (fr) Systems and methods for business analytics model scoring and selection
Lee et al. Machine learning approach for carbon disclosure in the Korean market: The role of environmental performance

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22828990

Country of ref document: EP

Kind code of ref document: A1

DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 22828990

Country of ref document: EP

Kind code of ref document: A1