# Load the required libraries, suppressing annoying startup messages
library(dplyr, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(tibble, quietly = TRUE, warn.conflicts = FALSE) # data manipulation
library(ggplot2, quietly = TRUE, warn.conflicts = FALSE) # data visualization
library(ggpubr, quietly = TRUE, warn.conflicts = FALSE) # data visualization
library(gsheet, quietly = TRUE, warn.conflicts = FALSE) # Google Sheets
library(rmarkdown, quietly = TRUE, warn.conflicts = FALSE) # writing
library(knitr, quietly = TRUE, warn.conflicts = FALSE) # tables
library(kableExtra, quietly = TRUE, warn.conflicts = FALSE) # tables
library(scales) # For formatting currency
Case (1 of 2): An Overview of the S&P500
Chapter 17.
S&P 500
The S&P 500, also called the Standard & Poor’s 500, is a stock market index that tracks the performance of 500 major publicly traded companies listed on U.S. stock exchanges. It serves as a widely accepted benchmark for assessing the overall health and performance of the U.S. stock market.
S&P Dow Jones Indices, a division of S&P Global, is responsible for maintaining the index. The selection of companies included in the S&P 500 is determined by a committee, considering factors such as market capitalization, liquidity, and industry representation.
The S&P 500 is a float-weighted index, meaning the market capitalization of each company in the index is adjusted by the number of its shares available for public trading. [1]
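To make float weighting concrete, here is a minimal sketch in R using three hypothetical companies. The tickers, prices, share counts, and float fractions are invented for illustration and are not S&P data; the index provider's actual methodology involves further adjustments that are not modeled here.

```r
# Hypothetical companies: every figure below is made up for illustration.
toy <- data.frame(
  Stock       = c("AAA", "BBB", "CCC"),
  Price       = c(100, 50, 20),    # price per share
  SharesOut   = c(10, 40, 100),    # total shares outstanding (millions)
  FloatFactor = c(1.0, 0.5, 0.8)   # fraction of shares available for public trading
)

# Full market capitalization vs. float-adjusted market capitalization
toy$MarketCap <- toy$Price * toy$SharesOut
toy$FloatCap  <- toy$MarketCap * toy$FloatFactor

# Index weight of each stock = its float-adjusted cap / total float-adjusted cap
toy$Weight <- round(toy$FloatCap / sum(toy$FloatCap), 4)

print(toy[, c("Stock", "MarketCap", "FloatCap", "Weight")])
```

In this toy example BBB and CCC have the same full market capitalization, yet CCC receives the larger weight because more of its shares are publicly tradable.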
The performance of the S&P 500 is frequently used to gauge the broader stock market and is commonly referenced by investors, analysts, and financial media. It provides a snapshot of how large-cap U.S. stocks are faring and is considered a reliable indicator of overall market sentiment.
Aside: Nominally, the S&P 500 index consists of 500 stocks. In practice, it includes 503 stocks, because three of the listed companies have multiple share classes and each share class is included in the index as a separate stock. [1]
Strengths:
Diverse Representation: The S&P 500 isn’t fixated on a single industry. From technology to healthcare, it offers a panoramic view of various economic sectors, making it an inclusive representation of the U.S. corporate sector.
Benchmark for Investors: For many fund managers, outperforming the S&P 500 is the gold standard, which makes the index a critical yardstick for gauging investment success.
Liquidity and Visibility: Constituent companies enjoy high liquidity and are subject to rigorous screening processes, ensuring that the index represents financially viable entities.
Critiques:
Market Capitalization Weighting: The index is weighted by market capitalization, meaning companies with higher market values have a more pronounced effect on its performance. Critics argue this approach can skew perceptions, especially during market bubbles when certain sectors are overvalued.
Exclusivity: Despite its broad purview, 500 companies cannot encapsulate the entire U.S. economy. Many sectors, especially emerging industries or smaller businesses, might not be adequately represented.
Potential for Complacency: The prominence of the S&P 500 has led many investors to adopt passive investment strategies, tracking the index rather than actively managing portfolios. Detractors argue this might lead to market inefficiencies and reduced capital allocation efficacy.
While the S&P 500 remains an influential tool for investors, its dominance brings both advantages and critiques. In a constantly evolving economic landscape, understanding both its power and its limitations is essential for informed financial decision-making. [2]
The broad purpose of this case study is to review and analyze the different sectors and stocks within the S&P 500.
S&P 500 Data
Load some useful R packages
Read the S&P500 data from a Google Sheet into a tibble
We will analyze a real-world, recent dataset containing information about the S&P500 stocks, sourced from TradingView.com. [3]
The dataset is located in a Google Sheet and periodically updated.
The complete URL of the Google Sheet that has the data is
https://docs.google.com/spreadsheets/d/14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ/
Its Google Sheet ID is 14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ.
Loading the data into R
- We can use the function gsheet2tbl in package gsheet to read the Google Sheet into a tibble, as demonstrated in the following code.
# Read S&P500 stock data present in a Google Sheet.
library(gsheet)
prefix <- "https://docs.google.com/spreadsheets/d/"
sheetID <- "14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ"
url500 <- paste0(prefix, sheetID) # Form the URL to connect to (paste0 adds no space)
sp500Data <- gsheet2tbl(url500) # Read it into a tibble named sp500Data
- Note: This data is current, as of Fri, Jan 5, 2024
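A small aside on forming the URL, offered as a sketch rather than a statement about how gsheet parses its input: base R's paste() separates its arguments with a space by default, whereas paste0() concatenates them with no separator. The snippet below needs no network access; it only compares the two strings built from the prefix and sheet ID given above.

```r
# Base URL and sheet ID as given in the text above
prefix  <- "https://docs.google.com/spreadsheets/d/"
sheetID <- "14mUlNNpeuV2RouT9MKaAWKUpvjRijzQu40DdWJgyKPQ"

url_space <- paste(prefix, sheetID)    # default sep = " " inserts a space
url_clean <- paste0(prefix, sheetID)   # no separator between the parts

grepl(" ", url_space)   # TRUE
grepl(" ", url_clean)   # FALSE
```

Using paste0() (or paste(..., sep = "")) keeps the URL identical to the one you would type into a browser.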
S&P Global Industry Classification Standard (GICS®)
- In this case study, we will classify and analyze the S&P 500 stocks based on GICS.
- The Global Industry Classification Standard (GICS®) was developed in 1999 by S&P Dow Jones Indices and MSCI. The GICS methodology aims to enhance the investment research and asset management process for financial professionals worldwide. The GICS methodology has been widely accepted as an industry analysis framework for investment research, portfolio management and asset allocation. [4]
- The GICS classification consists of 11 sectors: Communication Services, Consumer Discretionary, Consumer Staples, Energy, Financials, Health Care, Industrials, Information Technology, Materials, Real Estate, and Utilities. The classification of each stock in the S&P 500 according to GICS is available in the following Google Sheet:
https://docs.google.com/spreadsheets/d/1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk/
- For this file, the Google Sheet ID is 1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk. We read this classification data into a tibble named gics, using similar code.
# Read GICS classification of S&P 500 stocks from a Google Sheet.
library(gsheet)
prefix2 <- "https://docs.google.com/spreadsheets/d/"
sheetID2 <- "1WrVA8dPYvQsc_mXVctgTntRLS02qd7ubzcdAsw03Lgk"
urlgics <- paste0(prefix2, sheetID2) # Form the URL to connect to (paste0 adds no space)
gics <- gsheet2tbl(urlgics) # Read it into a tibble named gics
- Next, we join the two tibbles, using “Stock” as the key, and name the joined result sp500, as follows.
# Merge the two data frames on the shared key column "Stock"
sp500 <- merge(sp500Data, gics, by = "Stock")
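The behavior of merge() on a key column can be sketched with two toy data frames. The tickers and values below are hypothetical stand-ins for sp500Data and gics; the point is only to show which rows survive the join.

```r
# Toy stand-ins for sp500Data and gics; all values are hypothetical.
prices  <- data.frame(Stock = c("AAA", "BBB", "CCC"),
                      Price = c(10, 20, 30))
sectors <- data.frame(Stock = c("BBB", "CCC", "DDD"),
                      GICSSector = c("Energy", "Materials", "Utilities"))

# Inner join (merge's default): keeps only keys present in both tables
inner <- merge(prices, sectors, by = "Stock")

# Left join: keeps every row of the first table, filling gaps with NA
left <- merge(prices, sectors, by = "Stock", all.x = TRUE)

print(inner)
print(left)
```

Because the default is an inner join, any stock missing from either sheet drops out of the result. Note also that base merge() returns a plain data frame; dplyr::inner_join(sp500Data, gics, by = "Stock") is one alternative if a tibble is preferred.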
Review the S&P 500 data
- The data corresponds to 3 companies that are part of the S&P500 and includes 40 data columns, as of Fri, Jan 5, 2024
dim(sp500)
[1] 3 40
- The first ten stocks in the S&P 500 data (or all of them, if there are fewer than ten), with their descriptions and GICS sectors, are as follows:
sp500 %>%
  select(Stock, Description, GICSSector) %>%
  head(10) %>%
  kable("html",
        caption = "The first 10 companies in the S&P500") %>%
  kable_styling()
Stock | Description | GICSSector |
---|---|---|
AMCR | Amcor plc | Materials |
F | Ford Motor Company | Consumer Discretionary |
VTRS | Viatris Inc. | Health Care |
- Data Columns
- The data comprises the following 40 columns:
colnames(sp500)
[1] "Stock"
[2] "Date"
[3] "Description"
[4] "Sector"
[5] "Industry"
[6] "Market Capitalization"
[7] "Price"
[8] "52 Week Low"
[9] "52 Week High"
[10] "Return on Equity (TTM)"
[11] "Return on Assets (TTM)"
[12] "Return on Invested Capital (TTM)"
[13] "Gross Margin (TTM)"
[14] "Operating Margin (TTM)"
[15] "Net Margin (TTM)"
[16] "Price to Earnings Ratio (TTM)"
[17] "Price to Book (FY)"
[18] "Enterprise Value/EBITDA (TTM)"
[19] "EBITDA (TTM)"
[20] "EPS Diluted (TTM)"
[21] "EBITDA (TTM YoY Growth)"
[22] "EBITDA (Quarterly YoY Growth)"
[23] "EPS Diluted (TTM YoY Growth)"
[24] "EPS Diluted (Quarterly YoY Growth)"
[25] "Price to Free Cash Flow (TTM)"
[26] "Free Cash Flow (TTM YoY Growth)"
[27] "Free Cash Flow (Quarterly YoY Growth)"
[28] "Debt to Equity Ratio (MRQ)"
[29] "Current Ratio (MRQ)"
[30] "Quick Ratio (MRQ)"
[31] "Dividend Yield Forward"
[32] "Dividends per share (Annual YoY Growth)"
[33] "Price to Sales (FY)"
[34] "Revenue (TTM YoY Growth)"
[35] "Revenue (Quarterly YoY Growth)"
[36] "Technical Rating"
[37] "Index"
[38] "Security"
[39] "GICSSector"
[40] "GICSSubIndustry"
- The names of the data columns are descriptive. The financial terms are explained in depth on external websites such as www.Investopedia.com.
Rename Data Columns
- The names of the data columns are lengthy and awkward to use in code. We will rename the data columns to make it easier to work with the data.
# Define the new column names, one per column, in the original column order
new_names <- c(
  "Stock", "Date", "StockName", "Sector", "Industry",
  "MarketCap", "Price", "Low52Wk", "High52Wk",
  "ROE", "ROA", "ROIC", "GrossMargin",
  "OperatingMargin", "NetMargin", "PE",
  "PB", "EVEBITDA", "EBITDA", "EPS",
  "EBITDA_YOY", "EBITDA_QYOY", "EPS_YOY",
  "EPS_QYOY", "PFCF", "FCF",
  "FCF_QYOY", "DebtToEquity", "CurrentRatio",
  "QuickRatio", "DividendYield",
  "DividendsPerShare_YOY", "PS",
  "Revenue_YOY", "Revenue_QYOY", "Rating",
  "Index", "Security", "GICSSector", "GICSSubIndustry"
)
# Check that there is exactly one new name per column before renaming
stopifnot(length(new_names) == ncol(sp500))
# Rename the columns using the new_names vector
colnames(sp500) <- new_names
- We review the column names again after renaming them, using the colnames() function.
colnames(sp500)
[1] "Stock" "Date" "StockName"
[4] "Sector" "Industry" "MarketCap"
[7] "Price" "Low52Wk" "High52Wk"
[10] "ROE" "ROA" "ROIC"
[13] "GrossMargin" "OperatingMargin" "NetMargin"
[16] "PE" "PB" "EVEBITDA"
[19] "EBITDA" "EPS" "EBITDA_YOY"
[22] "EBITDA_QYOY" "EPS_YOY" "EPS_QYOY"
[25] "PFCF" "FCF" "FCF_QYOY"
[28] "DebtToEquity" "CurrentRatio" "QuickRatio"
[31] "DividendYield" "DividendsPerShare_YOY" "PS"
[34] "Revenue_YOY" "Revenue_QYOY" "Rating"
[37] "Index" "Security" "GICSSector"
[40] "GICSSubIndustry"
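Positional renaming depends on the vector having exactly one entry per column, in the right order; a vector that is one name short shifts every later name and leaves the last column named NA. A name-based mapping is less fragile. The following is a minimal sketch on a toy data frame, not the approach used in this chapter; the data values and the "Not A Column" entry are hypothetical.

```r
# Toy data frame with a few of the original column names (values are hypothetical)
df <- data.frame(
  Stock = c("AAA", "BBB"),
  `Market Capitalization` = c(1e9, 2e9),
  `52 Week Low` = c(8, 15),
  check.names = FALSE
)

# Mapping from old name to new name; the order of entries does not matter
lookup <- c(
  "52 Week Low"           = "Low52Wk",
  "Market Capitalization" = "MarketCap",
  "Not A Column"          = "Ignored"    # absent from df, so it is skipped
)

# Rename only the columns that are actually present in df
present <- names(lookup) %in% colnames(df)
idx <- match(names(lookup)[present], colnames(df))
colnames(df)[idx] <- unname(lookup[present])

# Guard against missing or duplicated names after renaming
stopifnot(!anyNA(colnames(df)), !anyDuplicated(colnames(df)))
colnames(df)
```

Columns without an entry in the mapping (here, Stock) keep their original names, so adding or reordering columns in the source sheet does not silently mislabel the data.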
Understand the Data Columns
- Our next goal is to gain a deeper understanding of what the data columns mean. We organize the column names into eight tables, labeled Table 1a through Table 1h.
- The column names described in Table 1a concern basic Company Information of each stock.
ColumnName | Description |
---|---|
Stock | Stock Ticker (e.g. AAL) |
Date | Date (e.g. "7/15/2023") |
StockName | Name of the company (e.g. "American Airlines Group, Inc.") |
GICSSector | Sector, as per GICS Classification |
GICSSubIndustry | Sub-Industry, as per GICS Classification |
MarketCap | Market capitalization of the company |
Price | Recent Stock Price |
- The column names described in Table 1b are related to Technical Analysis, including the 52-Week High and Low prices.
ColumnName | Description |
---|---|
Low52Wk | 52-Week Low Price |
High52Wk | 52-Week High Price |
Rating | Technical Rating |
- The column names described in Table 1c are related to the Profitability of each stock.
ColumnName | Description |
---|---|
ROE | Return on Equity |
ROA | Return on Assets |
ROIC | Return on Invested Capital |
GrossMargin | Gross Profit Margin |
OperatingMargin | Operating Profit Margin |
NetMargin | Net Profit Margin |
- The column names described in Table 1d are related to the Earnings of each stock.
ColumnName | Description |
---|---|
PE | Price-to-Earnings Ratio |
PB | Price-to-Book Ratio |
EVEBITDA | Enterprise Value to EBITDA Ratio |
EBITDA | EBITDA |
EPS | Earnings per Share |
EBITDA_YOY | EBITDA Year-over-Year Growth |
EBITDA_QYOY | EBITDA Quarterly Year-over-Year Growth |
EPS_YOY | EPS Year-over-Year Growth |
EPS_QYOY | EPS Quarterly Year-over-Year Growth |
- The column names described in Table 1e are related to the Free Cash Flow of each stock.
ColumnName | Description |
---|---|
PFCF | Price-to-Free Cash Flow |
FCF | Free Cash Flow Year-over-Year Growth (TTM) |
FCF_QYOY | Free Cash Flow Quarterly Year-over-Year Growth |
- The column names described in Table 1f concern the Leverage and Liquidity of each stock.
ColumnName | Description |
---|---|
DebtToEquity | Debt-to-Equity Ratio |
CurrentRatio | Current Ratio |
QuickRatio | Quick Ratio |
- The column names described in Table 1g are related to the Revenue of each stock.
ColumnName | Description |
---|---|
PS | Price-to-Sales Ratio |
Revenue_YOY | Revenue Year-over-Year Growth |
Revenue_QYOY | Revenue Quarterly Year-over-Year Growth |
- The column names described in Table 1h are related to the Dividends of each stock.
ColumnName | Description |
---|---|
DividendYield | Dividend Yield |
DividendsPerShare_YOY | Annual Dividends per Share Year-over-Year Growth |
Stock Prices, 52-Week Low, High; Market Cap in Billions
We want to analyze stock prices relative to their 52 Week Low and 52 Week High respectively, to understand their relative price attractiveness.
Hence, we add a new column named Low52WkPerc. The column contains the percentage change between the current price (Price) and its 52-week low (Low52Wk). The formula used is: \[\text{Low52WkPerc} = \frac{(\text{Price} - \text{Low52Wk}) \times 100}{\text{Low52Wk}}\]
Another column named High52WkPerc represents the percentage change between the 52-week high (High52Wk) and the current price (Price). We round the values to two decimal places for clarity.
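The two derived columns can be sketched as follows on a toy data frame with hypothetical prices. Low52WkPerc follows the formula stated above. The text does not spell out the formula for High52WkPerc, so the version here, which measures the gap relative to the 52-week high, is an assumption.

```r
# Toy prices; all values are hypothetical
toy <- data.frame(
  Stock    = c("AAA", "BBB"),
  Price    = c(120, 45),
  Low52Wk  = c(100, 40),
  High52Wk = c(150, 50)
)

# Percentage by which the current price sits above its 52-week low
# (formula from the text), rounded to two decimal places
toy$Low52WkPerc <- round((toy$Price - toy$Low52Wk) * 100 / toy$Low52Wk, 2)

# Assumption: percentage by which the current price sits below its
# 52-week high, measured relative to the 52-week high
toy$High52WkPerc <- round((toy$High52Wk - toy$Price) * 100 / toy$High52Wk, 2)

print(toy)
```

A low Low52WkPerc means the stock is trading near its yearly low, while a low High52WkPerc means it is trading near its yearly high.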
References
S&P 500
[1] https://www.investopedia.com/terms/s/sp500.asp
[2] S&P Global. (n.d.). S&P 500. Retrieved September 14, 2023, from https://www.spglobal.com/spdji/en/indices/equity/sp-500/
MarketWatch. (n.d.). S&P 500 Index. Retrieved September 14, 2023, from https://www.marketwatch.com/investing/index/spx
Bloomberg. (n.d.). S&P 500 Index (SPX:IND). Retrieved September 14, 2023, from https://www.bloomberg.com/quote/SPX:IND
[3] TradingView.com https://www.tradingview.com/screener/
[4] GICS: Global Industry Classification Standard: https://www.spglobal.com/spdji/en/landing/topic/gics/