Case (1 of 2): An Overview of the S&P500

Chapter 17.

S&P 500

The S&P 500, also called the Standard & Poor’s 500, is a stock market index that tracks the performance of 500 major publicly traded companies listed on U.S. stock exchanges. It serves as a widely accepted benchmark for assessing the overall health and performance of the U.S. stock market.

S&P Dow Jones Indices, a division of S&P Global, is responsible for maintaining the index. The selection of companies included in the S&P 500 is determined by a committee, considering factors such as market capitalization, liquidity, and industry representation.

The S&P 500 is a float-weighted index: each company's market capitalization is adjusted by the number of shares available for public trading before its weight in the index is computed. [1]
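To make the float adjustment concrete, the following sketch computes index weights for three hypothetical companies. The tickers, prices, share counts and float fractions are invented for illustration; they are not S&P data, and the index provider's actual methodology involves further details not shown here.

```r
# Hypothetical constituents: every number here is made up for illustration
stocks <- data.frame(
  Ticker      = c("AAA", "BBB", "CCC"),
  Price       = c(100, 50, 20),    # price per share
  SharesOut   = c(10, 40, 100),    # total shares outstanding (millions)
  FloatFactor = c(1.0, 0.5, 0.8)   # fraction of shares available to the public
)

# Full market cap versus float-adjusted market cap
stocks$MarketCap <- stocks$Price * stocks$SharesOut
stocks$FloatCap  <- stocks$MarketCap * stocks$FloatFactor

# Index weight (%) = float-adjusted cap / total float-adjusted cap
stocks$Weight <- round(100 * stocks$FloatCap / sum(stocks$FloatCap), 2)
print(stocks[, c("Ticker", "MarketCap", "FloatCap", "Weight")])
```

In this toy example BBB and CCC have the same full market capitalization, yet CCC receives the larger weight because a greater share of its stock trades publicly.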

The performance of the S&P 500 is frequently used to gauge the broader stock market and is commonly referenced by investors, analysts, and financial media. It provides a snapshot of how large-cap U.S. stocks are faring and is considered a reliable indicator of overall market sentiment.

Aside: Despite its name, the S&P 500 contains 503 stocks, not 500. The discrepancy arises because three of the listed companies have multiple share classes, and each class is included in the index as a separate stock. [1]

Strengths:

Diverse Representation: The S&P 500 isn’t fixated on a single industry. From technology to healthcare, it offers a panoramic view of various economic sectors, making it an inclusive representation of the U.S. corporate sector.
Benchmark for Investors: For many fund managers, outperforming the S&P 500 is the gold standard, which makes the index a critical yardstick for gauging investment success.
Liquidity and Visibility: Constituent companies enjoy high liquidity and are subject to rigorous screening processes, ensuring that the index represents financially viable entities.

Critiques:

Market Capitalization Weighting: The index is weighted by market capitalization, meaning companies with higher market values have a more pronounced effect on its performance. Critics argue this approach can skew perceptions, especially during market bubbles when certain sectors are overvalued.
Exclusivity: Despite its broad purview, 500 companies cannot encapsulate the entire U.S. economy. Many sectors, especially emerging industries or smaller businesses, might not be adequately represented.
Potential for Complacency: The prominence of the S&P 500 has led many investors to adopt passive investment strategies, tracking the index rather than actively managing portfolios. Detractors argue this might lead to market inefficiencies and reduced capital allocation efficacy.

While the S&P 500 remains an influential tool for investors, its dominance is a double-edged sword that brings both advantages and critiques. In a constantly evolving economic landscape, understanding both its power and its limitations is essential for informed financial decision-making. [2]

The broad purpose of this Case Study is to review and analyze the different sectors and stocks within the S&P500.

S&P 500 Data

Load some useful R packages

# Load the required libraries, suppressing annoying startup messages
library(dplyr, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(tibble, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(ggplot2, quietly = TRUE, warn.conflicts = FALSE) # data visualization
library(ggpubr, quietly = TRUE, warn.conflicts = FALSE) # data visualization

library(gsheet, quietly = TRUE, warn.conflicts = FALSE) # Google Sheets
library(rmarkdown, quietly = TRUE, warn.conflicts = FALSE) # writing
library(knitr, quietly = TRUE, warn.conflicts = FALSE) # tables
library(kableExtra, quietly = TRUE, warn.conflicts = FALSE) # tables
library(scales)  # For formatting currency

Read the S&P500 data from a Google Sheet into a tibble

We will analyze a real-world, recent dataset containing information about the S&P500 stocks, sourced from TradingView.com. [3]
The dataset is located in a Google Sheet and is periodically updated.
The complete URL of the Google Sheet that has the data is

https://docs.google.com/spreadsheets/d/14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ/
Its Google Sheet ID is: 14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ.

Loading the data into R

We can use the function gsheet2tbl() in the gsheet package to read the Google Sheet into a tibble, as demonstrated in the following code.

# Read S&P500 stock data present in a Google Sheet.
library(gsheet)
prefix <- "https://docs.google.com/spreadsheets/d/"
sheetID <- "14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ"
url500 <- paste0(prefix, sheetID) # Form the URL (paste0 adds no separator)
sp500Data <- gsheet2tbl(url500) # Read it into a tibble
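A quick way to confirm that the URL contains no stray whitespace is to compare the two base R concatenation functions. This is a small sketch; url_with_space and url_clean are illustrative names, not part of the case study.

```r
# paste() separates its arguments with a space by default; paste0() does not
prefix  <- "https://docs.google.com/spreadsheets/d/"
sheetID <- "14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ"

url_with_space <- paste(prefix, sheetID)   # contains an unwanted space
url_clean      <- paste0(prefix, sheetID)  # same as paste(prefix, sheetID, sep = "")

grepl(" ", url_with_space, fixed = TRUE)   # TRUE
grepl(" ", url_clean, fixed = TRUE)        # FALSE
```

Only the second form yields a URL that matches the one given in the text.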

Note: This data is current, as of Fri, Jan 5, 2024

S&P Global Industry Classification Standard (GICS®)

In this case study, we will classify and analyze the S&P 500 stocks based on the GICS standard!
The Global Industry Classification Standard (GICS®) was developed in 1999 by S&P Dow Jones Indices and MSCI. The GICS methodology aims to enhance the investment research and asset management process for financial professionals worldwide, and it has been widely accepted as an industry analysis framework for investment research, portfolio management and asset allocation. [4]
The GICS classification consists of 11 sectors: Communication Services, Consumer Discretionary, Consumer Staples, Energy, Financials, Health Care, Industrials, Information Technology, Materials, Real Estate, and Utilities. The classification of each stock in the S&P 500 according to GICS is available in the following Google Sheet:

https://docs.google.com/spreadsheets/d/1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk/

For this file, the Google Sheet ID is 1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk. We read this classification data into a tibble named gics, using similar code.

# Read GICS classification of S&P 500 stocks from a Google Sheet.
library(gsheet)
prefix2 <- "https://docs.google.com/spreadsheets/d/"
sheetID2 <- "1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk"
urlgics <- paste0(prefix2, sheetID2) # Form the URL (paste0 adds no separator)
gics <- gsheet2tbl(urlgics) # Read it into a tibble called gics
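Once the classification is loaded, it can be worth checking that every sector label is one of the 11 GICS sectors. The sketch below does this on a toy stand-in for gics; gics_toy, gics_sectors, unknown and sector_counts are illustrative names, and the real sheet may contain columns not shown here.

```r
# The 11 GICS sectors
gics_sectors <- c("Communication Services", "Consumer Discretionary",
                  "Consumer Staples", "Energy", "Financials", "Health Care",
                  "Industrials", "Information Technology", "Materials",
                  "Real Estate", "Utilities")

# Toy stand-in for the gics tibble
gics_toy <- data.frame(
  Stock      = c("AMCR", "F", "VTRS"),
  GICSSector = c("Materials", "Consumer Discretionary", "Health Care")
)

# Any label outside the 11 sector names would show up here
unknown <- setdiff(unique(gics_toy$GICSSector), gics_sectors)
length(unknown) == 0

# Number of stocks per sector; the factor levels keep empty sectors available
sector_counts <- table(factor(gics_toy$GICSSector, levels = gics_sectors))
sector_counts[sector_counts > 0]
```

Using a factor with fixed levels means sectors with no stocks still appear in the full count table with a value of zero.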

Next, we join the two data sets, using “Stock” as the key, and name the joined data sp500, as follows.

# Merge the two data sets on their common key column
sp500 <- merge(sp500Data,
               gics,
               by = "Stock")
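By default, merge() performs an inner join, so only keys present in both tables survive. The sketch below shows this on toy data and lists the keys that fail to match. The prices and the ticker XYZ are hypothetical, and left and right are illustrative stand-ins for the two data sets.

```r
# Toy stand-ins for the two tables (values are illustrative)
left  <- data.frame(Stock = c("AMCR", "F", "VTRS", "XYZ"),
                    Price = c(9.5, 12.0, 11.0, 1.0))
right <- data.frame(Stock = c("AMCR", "F", "VTRS"),
                    GICSSector = c("Materials", "Consumer Discretionary",
                                   "Health Care"))

# Inner join on the shared key: rows without a match in both tables are dropped
joined <- merge(left, right, by = "Stock")
nrow(joined)                        # 3

# Keys present on one side only; a non-empty result explains dropped rows
setdiff(left$Stock, right$Stock)    # "XYZ"
setdiff(right$Stock, left$Stock)    # character(0)

# all.x = TRUE keeps every row of the left table (a left join)
left_joined <- merge(left, right, by = "Stock", all.x = TRUE)
nrow(left_joined)                   # 4
```

Comparing nrow() before and after a join, together with the setdiff() checks, is a cheap way to notice when the join returns fewer rows than expected.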

Review the S&P 500 data

The joined data covers 3 companies that are part of the S&P500 and includes 40 data columns, as of Fri, Jan 5, 2024.

dim(sp500)

[1]  3 40

The first ten rows of the S&P500 data (or all rows, if there are fewer than ten), showing each stock's ticker, description and GICS sector, are as follows:

sp500 %>%
  select(Stock, Description, GICSSector) %>%
  head(10) %>%
  kable("html", 
        caption = "The first 10 companies in the S&P500") %>% 
  kable_styling()

The first 10 companies in the S&P500
Stock	Description	GICSSector
AMCR	Amcor plc	Materials
F	Ford Motor Company	Consumer Discretionary
VTRS	Viatris Inc.	Health Care

Data Columns

The data comprises the following 40 columns:

colnames(sp500)

 [1] "Stock"                                  
 [2] "Date"                                   
 [3] "Description"                            
 [4] "Sector"                                 
 [5] "Industry"                               
 [6] "Market Capitalization"                  
 [7] "Price"                                  
 [8] "52 Week Low"                            
 [9] "52 Week High"                           
[10] "Return on Equity (TTM)"                 
[11] "Return on Assets (TTM)"                 
[12] "Return on Invested Capital (TTM)"       
[13] "Gross Margin (TTM)"                     
[14] "Operating Margin (TTM)"                 
[15] "Net Margin (TTM)"                       
[16] "Price to Earnings Ratio (TTM)"          
[17] "Price to Book (FY)"                     
[18] "Enterprise Value/EBITDA (TTM)"          
[19] "EBITDA (TTM)"                           
[20] "EPS Diluted (TTM)"                      
[21] "EBITDA (TTM YoY Growth)"                
[22] "EBITDA (Quarterly YoY Growth)"          
[23] "EPS Diluted (TTM YoY Growth)"           
[24] "EPS Diluted (Quarterly YoY Growth)"     
[25] "Price to Free Cash Flow (TTM)"          
[26] "Free Cash Flow (TTM YoY Growth)"        
[27] "Free Cash Flow (Quarterly YoY Growth)"  
[28] "Debt to Equity Ratio (MRQ)"             
[29] "Current Ratio (MRQ)"                    
[30] "Quick Ratio (MRQ)"                      
[31] "Dividend Yield Forward"                 
[32] "Dividends per share (Annual YoY Growth)"
[33] "Price to Sales (FY)"                    
[34] "Revenue (TTM YoY Growth)"               
[35] "Revenue (Quarterly YoY Growth)"         
[36] "Technical Rating"                       
[37] "Index"                                  
[38] "Security"                               
[39] "GICSSector"                             
[40] "GICSSubIndustry"

The names of the data columns are largely self-explanatory. The financial terms are explained in depth on external websites such as www.Investopedia.com.

Rename Data Columns

The original column names are lengthy and contain spaces and punctuation, which makes them awkward to use in code. We rename the data columns to make the data easier to work with.

# Define the new column names, in the same order as the 40 existing columns
new_names <- c(
  "Stock", "Date", "StockName", "Sector", "Industry", 
  "MarketCap", "Price", "Low52Wk", "High52Wk", 
  "ROE", "ROA", "ROIC", "GrossMargin", 
  "OperatingMargin", "NetMargin", "PE", 
  "PB", "EVEBITDA", "EBITDA", "EPS", 
  "EBITDA_YOY", "EBITDA_QYOY", "EPS_YOY", 
  "EPS_QYOY", "PFCF", "FCF", 
  "FCF_QYOY", "DebtToEquity", "CurrentRatio", 
  "QuickRatio", "DividendYield", 
  "DividendsPerShare_YOY", "PS", 
  "Revenue_YOY", "Revenue_QYOY", "Rating",
  "Index", "Security", "GICSSector", "GICSSubIndustry"
)
# Rename the columns using the new_names vector
colnames(sp500) <- new_names

We review the column names again after renaming them, using the colnames() function.

colnames(sp500)

 [1] "Stock"                 "Date"                  "StockName"
 [4] "Sector"                "Industry"              "MarketCap"
 [7] "Price"                 "Low52Wk"               "High52Wk"
[10] "ROE"                   "ROA"                   "ROIC"
[13] "GrossMargin"           "OperatingMargin"       "NetMargin"
[16] "PE"                    "PB"                    "EVEBITDA"
[19] "EBITDA"                "EPS"                   "EBITDA_YOY"
[22] "EBITDA_QYOY"           "EPS_YOY"               "EPS_QYOY"
[25] "PFCF"                  "FCF"                   "FCF_QYOY"
[28] "DebtToEquity"          "CurrentRatio"          "QuickRatio"
[31] "DividendYield"         "DividendsPerShare_YOY" "PS"
[34] "Revenue_YOY"           "Revenue_QYOY"          "Rating"
[37] "Index"                 "Security"              "GICSSector"
[40] "GICSSubIndustry"
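Positional renaming with colnames() relies on the vector holding exactly one name per column, in the right order. One possible safeguard, sketched here on a toy data frame, is to rename by matching old names to new ones. The objects df, lookup and idx are illustrative and not part of the case study.

```r
# Toy data frame carrying three of the original column names
df <- data.frame(a = 1, b = 2, c = 3)
names(df) <- c("Stock", "Market Capitalization", "52 Week Low")

# Named lookup: old name -> new name (order does not matter)
lookup <- c("52 Week Low"           = "Low52Wk",
            "Market Capitalization" = "MarketCap")

# Guard: every old name in the lookup must exist in the data
stopifnot(all(names(lookup) %in% names(df)))

# Replace only the matched names; unmatched columns keep their names
idx <- match(names(lookup), names(df))
names(df)[idx] <- unname(lookup)
names(df)   # "Stock" "MarketCap" "Low52Wk"
```

A simpler check for the positional approach is stopifnot(length(new_names) == ncol(sp500)) before the assignment, which stops with an error when the two lengths differ.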

Understand the Data Columns

Our next goal is to gain a deeper understanding of what the data columns mean. We reorganize the column names into eight tables, labeled Table 1a through Table 1h.

The column names described in Table 1a concern basic Company Information for each stock.

Table 1a: Data Columns giving basic Company Information
ColumnName	Description
Stock	Stock Ticker (e.g. AAL)
Date	Date (e.g. "7/15/2023")
StockName	Name of the company (e.g. "American Airlines Group, Inc.")
GICSSector	Sector, as per GICS Classification
GICSSubIndustry	Sub-Industry, as per GICS Classification
MarketCap	Market capitalization of the company
Price	Recent Stock Price

The column names described in Table 1b are related to pricing and Technical Analysis, including the 52-Week High and Low prices.

Table 1b: Data Columns related to Pricing and Technical Analysis
ColumnName	Description
Low52Wk	52-Week Low Price
High52Wk	52-Week High Price
Rating	Technical Rating

The column names described in Table 1c. are related to the Profitability of each stock.

Table 1c: Data Columns related to Profitability
ColumnName	Description
ROE	Return on Equity
ROA	Return on Assets
ROIC	Return on Invested Capital
GrossMargin	Gross Profit Margin
OperatingMargin	Operating Profit Margin
NetMargin	Net Profit Margin

The column names described in Table 1d are related to the Earnings of each stock.

Table 1d: Data Columns related to Earnings
ColumnName	Description
PE	Price-to-Earnings Ratio
PB	Price-to-Book Ratio
EVEBITDA	Enterprise Value to EBITDA Ratio
EBITDA	EBITDA
EPS	Earnings per Share
EBITDA_YOY	EBITDA Year-over-Year Growth
EBITDA_QYOY	EBITDA Quarterly Year-over-Year Growth
EPS_YOY	EPS Year-over-Year Growth
EPS_QYOY	EPS Quarterly Year-over-Year Growth

The column names described in Table 1e are related to the Free Cash Flow of each stock.

Table 1e: Data Columns related to Free Cash Flow
ColumnName	Description
PFCF	Price-to-Free Cash Flow
FCF	Free Cash Flow Year-over-Year Growth (TTM)
FCF_QYOY	Free Cash Flow Quarterly Year-over-Year Growth

The column names described in Table 1f concern the Liquidity of each stock.

Table 1f: Data Columns related to Liquidity
ColumnName	Description
DebtToEquity	Debt-to-Equity Ratio
CurrentRatio	Current Ratio
QuickRatio	Quick Ratio

The column names described in Table 1g are related to the Revenue of each stock.

Table 1g: Data Columns related to Revenue
ColumnName	Description
PS	Price-to-Sales Ratio
Revenue_YOY	Revenue Year-over-Year Growth
Revenue_QYOY	Revenue Quarterly Year-over-Year Growth

The column names described in Table 1h are related to the Dividends of each stock.

Table 1h: Data Columns related to Dividends
ColumnName	Description
DividendYield	Dividend Yield
DividendsPerShare_YOY	Annual Dividends per Share Year-over-Year Growth

Stock Prices, 52-Week Low, High; Market Cap in Billions

We want to analyze stock prices relative to their 52 Week Low and 52 Week High respectively, to understand their relative price attractiveness.

We therefore add a new column named Low52WkPerc, which contains the percentage change between the current price (Price) and its 52-week low (Low52Wk). The formula used is: \[\text{Low52WkPerc} = \frac{(\text{Price} - \text{Low52Wk}) \times 100}{\text{Low52Wk}}\]

Another new column, High52WkPerc, represents the percentage change between the 52-week high (High52Wk) and the current price (Price). We round both columns to two decimal places for clarity.
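A minimal sketch of these two columns with dplyr follows. The toy prices are invented. The text gives no formula for High52WkPerc, so the version here, which measures the gap to the 52-week high relative to that high, is an assumption.

```r
library(dplyr, warn.conflicts = FALSE)

# Toy rows using the renamed columns; the tickers and prices are invented
toy <- data.frame(
  Stock    = c("AAA", "BBB"),
  Price    = c(120, 45),
  Low52Wk  = c(100, 40),
  High52Wk = c(150, 50)
)

toy <- toy %>%
  mutate(
    # Percentage by which the price sits above its 52-week low
    Low52WkPerc  = round((Price - Low52Wk) * 100 / Low52Wk, 2),
    # ASSUMPTION: gap to the 52-week high, expressed relative to the high
    High52WkPerc = round((High52Wk - Price) * 100 / High52Wk, 2)
  )
toy
```

For AAA, the price of 120 is 20% above its low of 100 and 20% below its high of 150. A small Low52WkPerc indicates a stock trading near its 52-week low, and a small High52WkPerc one trading near its 52-week high.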

References

S&P 500

[1] https://www.investopedia.com/terms/s/sp500.asp

[2] S&P Global. (n.d.). S&P 500. Retrieved September 14, 2023, from https://www.spglobal.com/spdji/en/indices/equity/sp-500/

MarketWatch. (n.d.). S&P 500 Index. Retrieved September 14, 2023, from https://www.marketwatch.com/investing/index/spx

Bloomberg. (n.d.). S&P 500 Index (SPX:IND). Retrieved September 14, 2023, from https://www.bloomberg.com/quote/SPX:IND

[3] TradingView.com https://www.tradingview.com/screener/

[4] GICS: Global Industry Classification Standard: https://www.spglobal.com/spdji/en/landing/topic/gics/
