Case (1 of 2): An Overview of the S&P500

Chapter 17.

S&P 500

The S&P 500, also called the Standard & Poor’s 500, is a stock market index that tracks the performance of 500 major publicly traded companies listed on U.S. stock exchanges. It serves as a widely accepted benchmark for assessing the overall health and performance of the U.S. stock market.

S&P Dow Jones Indices, a division of S&P Global, is responsible for maintaining the index. The selection of companies included in the S&P 500 is determined by a committee, considering factors such as market capitalization, liquidity, and industry representation.

The S&P is a float-weighted index, meaning the market capitalizations of the companies in the index are adjusted by the number of shares available for public trading. [1]

The performance of the S&P 500 is frequently used to gauge the broader stock market and is commonly referenced by investors, analysts, and financial media. It provides a snapshot of how large-cap U.S. stocks are faring and is considered a reliable indicator of overall market sentiment.

Aside: Typically, the S&P 500 index consists of 500 stocks. However, in reality, there are actually 503 stocks included. This discrepancy arises because three of the listed companies have multiple share classes, and each class is considered a separate stock that needs to be included in the index. [1]

Strengths:

  1. Diverse Representation: The S&P 500 isn’t fixated on a single industry. From technology to healthcare, it offers a panoramic view of various economic sectors, making it an inclusive representation of the U.S. corporate sector.

  2. Benchmark for Investors: For many fund managers, outperforming the S&P 500 stands as a golden standard. It’s a yardstick, establishing it as a critical touchstone for gauging investment success.

  3. Liquidity and Visibility: Constituent companies enjoy high liquidity and are subject to rigorous screening processes, ensuring that the index represents financially viable entities.

Critiques:

  1. Market Capitalization Weighting: The index is weighted by market capitalization, meaning companies with higher market values have a more pronounced effect on its performance. Critics argue this approach can skew perceptions, especially during market bubbles when certain sectors are overvalued.

  2. Exclusivity: Despite its broad purview, 500 companies cannot encapsulate the entire U.S. economy. Many sectors, especially emerging industries or smaller businesses, might not be adequately represented.

  3. Potential for Complacency: The prominence of the S&P 500 has led many investors to adopt passive investment strategies, tracking the index rather than actively managing portfolios. Detractors argue this might lead to market inefficiencies and reduced capital allocation efficacy.

While the S&P 500 remains an influential and pivotal tool for investors, its dominance prompts a double-edged sword of advantages and critiques. In a constantly evolving economic landscape, understanding both its power and limitations is essential for informed financial decision-making. [2]

The broad purpose of this Case Study is to review and analyze the different sectors and stocks within the S&P500.

S&P 500 Data

Load some useful R packages

# Load the required libraries, suppressing annoying startup messages
library(dplyr, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(tibble, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(ggplot2, quietly = TRUE, warn.conflicts = FALSE) # data visualization
library(ggpubr, quietly = TRUE, warn.conflicts = FALSE) # data visualization

library(gsheet, quietly = TRUE, warn.conflicts = FALSE) # Google Sheets
library(rmarkdown, quietly = TRUE, warn.conflicts = FALSE) # writing
library(knitr, quietly = TRUE, warn.conflicts = FALSE) # tables
library(kableExtra, quietly = TRUE, warn.conflicts = FALSE) # tables
library(scales)  # For formatting currency

Read the S&P500 data from a Google Sheet into a tibble

  1. We will analyze a real-world, recent dataset containing information about the S&P500 stocks, sourced from TradingView.com. [3]

  2. The dataset is located in a Google Sheet and periodically updated.

  3. The complete URL of the Google Sheet that has the data is

    https://docs.google.com/spreadsheets/d/14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ/

  4. Its Google Sheet ID is: 14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ.

Loading the data into R

  1. We can use the function gsheet2tbl in package gsheet to read the Google Sheet into a tibble , as demonstrated in the following code.
# Read S&P500 stock data present in a Google Sheet.
library(gsheet)
prefix <- "https://docs.google.com/spreadsheets/d/"
sheetID <- "14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ"
url500 <- paste(prefix,sheetID) # Form the URL to connect to
sp500Data <- gsheet2tbl(url500) # Read it into a tibble 
  1. Note: This data is current, as of Fri, Jan 5, 2024

S&P Global Industry Classification Standard (GICS®)

  1. In this case study, we will classify and analyze the S&P 500 stocks based on the GICS standard!
  2. The Global Industry Classification Standard (GICS®) was developed in 1999 by S&P Dow Jones Indices and MSCI. The GICS methodology aims to enhance the investment research and asset management process for financial professionals worldwide. The GICS methodology has been widely accepted as an industry analysis framework for investment research, portfolio management and asset allocation. [4]
  3. The GICS classification consists of 11 sectors, – {Communication Services, Consumer Discretionary, Consumer Staples, Energy, Financials, Health Care, Industrials, Information Technology, Materials, Real Estate, Utilities}. The classification of each stock in the S&P 500 according to GICS is available at the following Google Sheet:

https://docs.google.com/spreadsheets/d/1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk/

  1. For this file, the Google Sheet ID is 1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgkand we read this classification data into a tibble, we name gics, using similar code.
# Read GICS classificaiton of S&P 500 stocks from a Google Sheet.
library(gsheet)
prefix2 <- "https://docs.google.com/spreadsheets/d/"
sheetID2 <- "1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk"
urlgics <- paste(prefix2, sheetID2) # Form the URL to connect to
gics <- gsheet2tbl(urlgics) # Read it into a tibble called gics
  1. Next, we join the two tibbles, using “Stock” as the key and name our joint tibble sp500, as follows.
# Merging dataframes
sp500 <- merge(sp500Data, 
               gics , 
               id = "Stock")

Review the S&P 500 data

  1. The data corresponds to 3 companies that are part of the S&P500 and includes 40 data columns, as of Fri, Jan 5, 2024
dim(sp500)
[1]  3 40
  1. The first ten stocks in the S&P500 data, their GICS Sector and their recent prices are as follows:
sp500 %>%
  select(Stock, Description, GICSSector) %>%
  head(10) %>%
  kable("html", 
        caption = "The first 10 companies in the S&P500") %>% 
  kable_styling()
The first 10 companies in the S&P500
Stock Description GICSSector
AMCR Amcor plc Materials
F Ford Motor Company Consumer Discretionary
VTRS Viatris Inc. Health Care
  1. Data Columns
  • The data comprises of the following 40 columns:
colnames(sp500) 
 [1] "Stock"                                  
 [2] "Date"                                   
 [3] "Description"                            
 [4] "Sector"                                 
 [5] "Industry"                               
 [6] "Market Capitalization"                  
 [7] "Price"                                  
 [8] "52 Week Low"                            
 [9] "52 Week High"                           
[10] "Return on Equity (TTM)"                 
[11] "Return on Assets (TTM)"                 
[12] "Return on Invested Capital (TTM)"       
[13] "Gross Margin (TTM)"                     
[14] "Operating Margin (TTM)"                 
[15] "Net Margin (TTM)"                       
[16] "Price to Earnings Ratio (TTM)"          
[17] "Price to Book (FY)"                     
[18] "Enterprise Value/EBITDA (TTM)"          
[19] "EBITDA (TTM)"                           
[20] "EPS Diluted (TTM)"                      
[21] "EBITDA (TTM YoY Growth)"                
[22] "EBITDA (Quarterly YoY Growth)"          
[23] "EPS Diluted (TTM YoY Growth)"           
[24] "EPS Diluted (Quarterly YoY Growth)"     
[25] "Price to Free Cash Flow (TTM)"          
[26] "Free Cash Flow (TTM YoY Growth)"        
[27] "Free Cash Flow (Quarterly YoY Growth)"  
[28] "Debt to Equity Ratio (MRQ)"             
[29] "Current Ratio (MRQ)"                    
[30] "Quick Ratio (MRQ)"                      
[31] "Dividend Yield Forward"                 
[32] "Dividends per share (Annual YoY Growth)"
[33] "Price to Sales (FY)"                    
[34] "Revenue (TTM YoY Growth)"               
[35] "Revenue (Quarterly YoY Growth)"         
[36] "Technical Rating"                       
[37] "Index"                                  
[38] "Security"                               
[39] "GICSSector"                             
[40] "GICSSubIndustry"                        
  • The names of the data columns are self-explanatory. The Financial terms are explained in depth on multiple external websites such as www.Investopedia.com

Rename Data Columns

  1. The names of the data columns are lengthy and confusing. We will rename the data columns to make it easier to work with the data.
# Define a mapping of new column names
new_names <- c(
  "Stock", "Date", "StockName", "Sector", "Industry", 
  "MarketCap", "Price", "Low52Wk", "High52Wk", 
  "ROE", "ROA", "ROIC", "GrossMargin", 
  "OperatingMargin", "NetMargin", "PE", 
  "PB", "EVEBITDA", "EBITDA", "EPS", 
  "EBITDA_YOY", "EBITDA_QYOY", "EPS_YOY", 
  "EPS_QYOY", "PFCF", "FCF", 
  "FCF_QYOY", "DebtToEquity", "CurrentRatio", 
  "QuickRatio", "DividendYield", 
  "DividendsPerShare_YOY", "PS", 
  "Revenue_YOY", "Revenue_QYOY", "Rating",
  "Security", "GICSSector", "GICSSubIndustry"
)
# Rename the columns using the new_names vector
colnames(sp500)<-new_names
  1. We review the column names again after renaming them, using the colnames() function.
colnames(sp500)
 [1] "Stock"                 "Date"                  "StockName"            
 [4] "Sector"                "Industry"              "MarketCap"            
 [7] "Price"                 "Low52Wk"               "High52Wk"             
[10] "ROE"                   "ROA"                   "ROIC"                 
[13] "GrossMargin"           "OperatingMargin"       "NetMargin"            
[16] "PE"                    "PB"                    "EVEBITDA"             
[19] "EBITDA"                "EPS"                   "EBITDA_YOY"           
[22] "EBITDA_QYOY"           "EPS_YOY"               "EPS_QYOY"             
[25] "PFCF"                  "FCF"                   "FCF_QYOY"             
[28] "DebtToEquity"          "CurrentRatio"          "QuickRatio"           
[31] "DividendYield"         "DividendsPerShare_YOY" "PS"                   
[34] "Revenue_YOY"           "Revenue_QYOY"          "Rating"               
[37] "Security"              "GICSSector"            "GICSSubIndustry"      
[40] NA                     

Understand the Data Columns

  1. Our next goal is to gain a deeper understanding of what the data columns mean. We reorganize the column names into eight tables, labeled Table 1a, 1b.. 1h.
  1. The column names described in Table 1a. concern basic Company Information of each stock.
Table 1a: Data Columns giving basic Company Information
ColumnName Description
Stock Stock Ticker (e.g. AAL)
Date Date (e.g. "7/15/2023")
StockName Name of the company (e.g "American Airlines Group, Inc.")
GICSSector Sector, as per GICS Classification
GICSSubIndustry Sub-Industry, as per GICS Classification
MarketCap Market capitalization of the company
Price Recent Stock Price
  1. The column names described in Table 1b. are related to Technical Analysis, including the 52-Week High and Low prices.
Table 1b: Data Columns related to Pricing and Technical Analysis
ColumnName Description
Low52Wk 52-Week Low Price
High52Wk 52-Week High Price
Rating Technical Rating
  1. The column names described in Table 1c. are related to the Profitability of each stock.
Table 1c: Data Columns related to Profitability
ColumnName Description
ROE Return on Equity
ROA Return on Assets
ROIC Return on Invested Capital
GrossMargin Gross Profit Margin
OperatingMargin Operating Profit Margin
NetMargin Net Profit Margin
  1. The column names described in Table 1d are related to the Earnings of each stock.
Table 1d: Data Columns related to Earnings
ColumnName Description
PE Price-to-Earnings Ratio
PB Price-to-Book Ratio
EVEBITDA Enterprise Value to EBITDA Ratio
EBITDA EBITDA
EPS Earnings per Share
EBITDA_YOY EBITDA Year-over-Year Growth
EBITDA_QYOY EBITDA Quarterly Year-over-Year Growth
EPS_YOY EPS Year-over-Year Growth
EPS_QYOY EPS Quarterly Year-over-Year Growth
  1. The column names described in Table 1e are related to the Free Cash Flow of each stock.
Table 1e: Data Columns related to Free Cash Flow
ColumnName Description
PFCF Price-to-Free Cash Flow
FCF Free Cash Flow
FCF_QYOY Free Cash Flow Quarterly Year-over-Year Growth
  1. The column names described in Table 1f concern the Liquidity of each stock.
Table 1f: Data Columns related to Liquidiy
ColumnName Description
DebtToEquity Debt-to-Equity Ratio
CurrentRatio Current Ratio
QuickRatio Quick Ratio
  1. The column names described in Table 1g are related to the Revenue of each stock.
Table 1g: Data Columns related to Revenue
ColumnName Description
PS Price-to-Sales Ratio
Revenue_YOY Revenue Year-over-Year Growth
Revenue_QYOY Revenue Quarterly Year-over-Year Growth
  1. The column names described in Table 1h are related to the Dividends of each stock.
Table 1h: Data Columns related to Dividends
ColumnName Description
DividendYield Dividend Yield
DividendsPerShare_YOY Annual Dividends per Share Year-over-Year Growth

Stock Prices, 52-Week Low, High; Market Cap in Billions

We want to analyze stock prices relative to their 52 Week Low and 52 Week High respectively, to understand their relative price attractiveness.

Hence, a new column named Low52WkPerc is being added. The column contains the percentage change between the current price (Price) and its 52-week low (Low52Wk). The formula used is: \[Low52WkPerc = \frac{(CurrentPrice - 52WeekLow)*100}{52WeekLow}\]

Another column named High52WkPerc represents the percentage change between the 52-week high (High52Wk) and the current price (Price). We round off the data to two decimal places for clarity.

References

S&P 500

[1] https://www.investopedia.com/terms/s/sp500.asp

[2] S&P Global: S&P Global. (n.d.). S&P 500. Retrieved September 14, 2023, from https://www.spglobal.com/spdji/en/indices/equity/sp-500/

MarketWatch: MarketWatch. (n.d.). S&P 500 Index. Retrieved September 14, 2023, from https://www.marketwatch.com/investing/index/spx

Bloomberg: Bloomberg. (n.d.). S&P 500 Index (SPX:IND). Retrieved September 14, 2023, from https://www.bloomberg.com/quote/SPX:IND

[3] TradingView.com https://www.tradingview.com/screener/

[4] GICS: Global Industry Classification Standard: https://www.spglobal.com/spdji/en/landing/topic/gics/