Financial Statement Analysis Using Bayesian Methods

Overview

This is a personal project where I looked into whether the financial ratios I’ve been using in my own investing strategy actually have any real effect on stock price changes, specifically in the Turkish stock market, during a one year period following the release of 2023 Q4 financial statements.

The approach was simple: take the ratios I already use (some standard, some custom), run some exploratory checks, and then fit a Bayesian hierarchical linear regression model to see if any of them can explain price movements across sectors.

As expected, no strong linear relationship showed up between most ratios and the response. But interestingly, one of the key ratios I already rely on, IncomeRatio, showed a significant effect in a few sectors. Another ratio, Investing, appeared to have a small negative effect overall in the second model.

This isn’t meant to be predictive. The Turkish market is volatile, thinly traded in places, and often driven by factors beyond fundamentals. But I wanted to run this analysis just to check if my personal logic holds up when viewed statistically, and in a few small ways, it kind of does.

Introduction

Financial markets, whether money or capital markets, end up crossing everyone’s path at some point in life. For me, that moment came when I was 16. I was drawn to the ever changing prices of stocks, the high risk world of leveraged forex trading, the volatile crypto markets, and the relative safety of investment funds. Over the last 10 years, I’ve dabbled in nearly all of them, even if only on a small scale.

Of course, with time and experience, your perspective shifts. In the end, I came to believe that stock markets, among others, are relatively less prone to manipulation and more predictable, especially in the long run. After all, the financial health of publicly traded companies should, at least in theory, be reflected in their share prices over time.

The aim of this project is to explore, in a more concrete and analytical way, a question I’ve already formed an intuitive answer to, at least within the context of the Turkish capital markets.

Financial Statements

Publicly traded companies are required to disclose their financial status to the public at certain times of the year. The three primary types of financial statements are: the balance sheet, the income statement, and the cash flow statement. Most companies release these reports four times a year, usually every quarter, though the exact timing can vary a bit by industry. Investors, ideally, use these reports to help them decide where to put their money.

Three core financial reports and what they show, briefly:

Income Statement: This report shows a company’s sales, costs, and profits over a set period – usually a quarter or a whole year. It basically tells how the company is performing in its day-to-day business.
Some items: Revenue, Cost of Goods Sold, Operating Income, Net Profit.

Balance Sheet: Provides a snapshot of a company’s financial position at a specific point in time. It shows what the company owns (assets), what it owes (liabilities), and the residual value to shareholders (equity).
Some items: Total Assets, Accounts Payable, Shareholder Equity, Retained Earnings.

Cash Flow Statement: Tracks the inflow and outflow of cash during a reporting period. It breaks cash activity into operating, investing, and financing activities, helping investors understand liquidity and cash management.
Some items: Net Cash from Operations, Capital Expenditures, Debt Issuance, Dividends Paid.

Each report contains many different line items. Over the years, professional investors and financial analysts have suggested various financial ratios using the data in these reports to summarize a company’s status in terms of profitability, efficiency, leverage, liquidity, and other dimensions. These ratios are still widely used by investors to inform decision-making and compare companies across time and industries.

For a comprehensive overview of financial ratios:
White, G. I., Sondhi, A. C., & Fried, D. (2002). The analysis and use of financial statements. John Wiley & Sons.

Recent academic studies have shown some of these financial ratios can partially explain the stock price changes of companies listed on the Turkish stock exchange (Uyar & Sarak, 2020; Sarı & Kırkık, 2019; Tekin & Bastak, 2021).

The research question of this project follows a similar line of inquiry:

Do financial indicators of a company affect its stock price change?

However, I’ve reframed this question slightly to focus on my own experience:

Do the custom financial ratios I personally used when selecting companies to invest in help explain stock price changes?

I’ll describe these personally used ratios in more detail in the upcoming sections.

Data

I gathered the dataset from various sources. All data files (.csv) are available at /Financial-Ratio-Analysis/Data, along with the R scripts (/Financial-Ratio-Analysis/R-script) used for data acquisition and formatting.

1. Stock List and Sector Classification

I grabbed the list of companies and what sectors they belong to from the Public Disclosure Platform (KAP, in Turkish). That’s a government website where Turkish companies share all their official financial and corporate info. Each company is grouped into a main sector and a sub-sector. I decided to use the main sectors for sorting, except in the manufacturing and financial categories where there were just too many companies. I made some tweaks there to keep things balanced.

2. Financial Statements

I pulled financial data from Yahoo Finance using its hidden API (the full script is available at /Financial-Ratio-Analysis/R-script). Yahoo Finance allows access to the last four annual reports and last five quarterly reports for publicly traded companies.

Since this project focuses on long-term financial fundamentals, I chose to use annual reports.

Yahoo’s API also gives you the option to select specific line items. Each report contains over 150 financial indicators, so retrieving everything would be inefficient. Instead, I selected a targeted set of variables that were most relevant to my analysis.

Below are the financial items I retrieved for each stock on the Turkish stock market:

Income Statement: Total Revenue Cost of Revenue Net Income

Balance Sheet: Total Assets Total Liabilities (Net of Minority Interest) Current Liabilities Total Equity (Gross of Minority Interest) Working Capital

Cash Flow Statement: Investing Cash Flow Share Issued

These variables formed the basis for computing custom financial ratios, which I explain in detail in later sections.

3. Historical Market Prices (in TRY)

For stock price data, I used the quantmod package in R, which provides a convenient interface to fetch historical prices from sources like Yahoo Finance.

To evaluate how stock prices changed following the release of financial statements, I defined pre- and post-reporting periods, each lasting 40 trading days. I then pulled the adjusted closing prices for these windows and calculated a weighted average price for each period, for every stock.

The weighting was designed so that:

Earlier prices in the pre period (coded as before in the analysis) received higher weights (reflecting the price level before the report’s influence),

Later prices in the post period (coded as after) received higher weights (to capture the full market response).

Pre-period: 2024-02-06 to 2024-04-01
Post-period: 2025-02-15 to 2025-04-15

The percentage change between the weighted pre and post prices was used as the response variable in the analysis. I’ll get into the details of how I picked those periods later on.

Exploratory Analysis of Sector and Market Cap Based Stock Behavior

For this part of the project, I wanted to explore the data by focusing on company size, using market capitalization as a guide. I thought it would be interesting — especially considering the broader economic environment during the period I analyzed.

Between early 2024 and early 2025, Turkey faced a tough financial climate. Inflation was high (change in CPI ~ 83% for 2024), the interest rate was stuck at 50%, and the global outlook wasn’t any better. There were active wars close by, uncertainty in global markets, and a general sense of instability everywhere. Naturally, all of this had an impact on Turkey’s economy and, by extension, its stock market.

In times like these, investor behavior tends to change. People pull back from risky investments, and money usually flows into safer, more stable assets. That’s why company size can matter: large-cap companies often weather the storm better, while small and mid-cap companies tend to be more volatile.

Before diving into the details of the financial ratios and how they relate to price changes, I thought this would be a good starting point—to get a feel for how different sizes of companies reacted during a very unstable year.

a) Data Preperation

Load required packages
Import raw datasets
Prepare cleaned data frames used throughout the EDA

Note: Comments inside the code are shown in square brackets ([]) and explained below.

# For data wrangling
library(tidyverse)

# For plotting
library(gridExtra)
library(corrplot)
library(corrplot)
library(plotly)
library(ggpubr)
library(bayesplot)

# MCMC & Bayesian tools
library(LaplacesDemon)
library(coda)
library(rjags)

# Importing raw datasets
df_annual <- read_csv("/Financial-Ratio-Analysis/Data/df_annual.csv")
prices_annual <- read_csv("/Financial-Ratio-Analysis/Data/prices_annual.csv")

# Merging raw dataframes
annual_data_raw <- merge(prices_annual, df_annual, by.x = "stock") %>%
  mutate(across(-c(1:4), ~ ./1e6))          # [1]

# Filtering stocks missing price or market cap data
annual_data <- annual_data_raw %>% 
  mutate(PriceChange = 100 * (after - before) / before,         # [2]
         MarketCap = `2024ShareIssued` * before) %>%            # [3]
  select(stock, sector, PriceChange, MarketCap) %>%
  filter(!is.na(PriceChange))

# Saving stocks that have no marketcap data for later use
annual_data_no_marketcap <- annual_data_raw %>% 
  mutate(PriceChange = 100 * (after - before) / before,
         MarketCap = `2024ShareIssued` * before) %>%
  filter(!is.na(PriceChange) & is.na(MarketCap)) %>%
  select(stock, sector, PriceChange, MarketCap)

[1] Values are scaled to millions for cleaner interpretation.
[2] PriceChange = Percent change in adjusted closing prices between the pre- and post-periods.
[3] We’re using adjusted closing prices, which account for stock splits and dividends. Since adjusted prices are aligned with today’s share structure, the share count (ShareIssued) must also reflect the current number.

head(annual_data, 5)

##   stock                                           sector PriceChange MarketCap
## 1 A1CAP                                 BROKERAGE HOUSES     -25.000    4320.0
## 2 ACSEL CHEMICALS, PETROLEUM RUBBER AND PLASTIC PRODUCTS     -31.012    1662.9
## 3  ADEL                     OTHER MANUFACTURING INDUSTRY     -38.521   13357.6
## 4 ADESE                           REAL ESTATE ACTIVITIES     -32.143    2822.4
## 5 ADGYO                    REAL ESTATE INVESTMENT TRUSTS     -15.169   10455.7

b) Company Count per Sector

To better understand the structure of the data, I first looked at how many companies were represented in each sector. Some sectors turned out to have only a single company, which makes any kind of sector-level comparison or modeling practically meaningless. So, I decided to exclude those sectors from the sector-based part of the analysis.

company_count_by_sector <- annual_data %>%
  group_by(sector) %>%
  summarise(count = n()) %>%
  arrange(desc(-count)) 

g <- company_count_by_sector %>%
  mutate(low = factor(ifelse(count < 2, 0, 1))) %>%
  ggplot(aes(x = reorder(sector, -count),
             y = count,
             text = sector)) +
  geom_bar(aes(fill = low), stat = "identity") +
  labs(title = "Company Count by Sector", 
       x = "Sector",
       y = "Count") +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        legend.position = "none") 


ggplotly(g, tooltip = "sector")

As you can see above, some sectors include only a single company.

# One-company sectors
one_comp_sectors <- annual_data %>%           
  group_by(sector) %>%
  summarise(count = n()) %>%
  filter(count == 1) %>%
  pull(sector)

print(one_comp_sectors)

## [1] "FINANCING COMPANIES"          "OTHER MANUFACTURING INDUSTRY"

annual_data_exc_sectors <- annual_data %>%      # [1]
  filter(!sector %in% one_comp_sectors)

[1] Annual data with one-company sectors excluded for further sector-based analysis

c) Price Change by Sector

Sometimes the market gets pulled in a specific direction by hype around certain sectors. A few years ago, electric vehicles and green energy were all over the news. More recently, it’s AI companies getting all the attention—often with price moves that go way beyond what the financials would suggest. In Turkey, we’ve seen similar waves, like the spike in defense, energy, and electricity stocks following the start of the Russia–Ukraine conflict. So in this analysis, I wanted to check: did anything like that happen during the quieter and more difficult financial period between early 2024 and early 2025?

g <- annual_data_exc_sectors %>%
  ggplot(aes(x = sector,
             y = PriceChange,
             fill = sector)) +
  geom_boxplot() +
  geom_hline(yintercept = 0, col = "red", linewidth = 0.5, linetype = "dashed") +
  labs(title = "Price Change by Sector",
       x = "Sectors",
       y = "Price Change (%)") +
  theme_minimal(base_size = 13) +
  theme(axis.text.x = element_blank(),  
        axis.ticks.x = element_blank(),
        legend.position = "none",
        plot.title = element_text(face = "bold")) +
  scale_y_continuous(breaks = seq(-220, 250, 20)) +
  coord_cartesian(ylim = c(-210, 250)) +
  geom_point(aes(x = sector, y = -220, text = sector),
             size = 3)

g <- ggplotly(g, tooltip = "text")

ggplotly(g, tooltip = "text")

Looking at the boxplot, you can tell that most sectors didn’t do so well. Two-thirds of them had a negative median price change. But at the same time, we see more positive outliers than negative ones—some stocks skyrocketed. That gap between the few big winners and the rest suggests a very uneven, unstable market. In other words, the market overall didn’t do well—but a handful of stocks did really well.

The top-performing sector was Administrative and Support Service Activities:

##     stock PriceChange MarketCap
## 85  BORLS     144.536    6174.4
## 109 CEOEM      50.673     981.2
## 174 ESCAR     158.081    9900.0
## 187  FLAP     -41.667    1237.5
## 221 GZNMI     157.908    2548.0
## 311 LIDER     164.458   10956.0
## 394 PLTUR      38.298    4601.3
## 443 SNKRN     196.800        NA

Out of 8 companies in this sector, 5 more than doubled their market price. That’s impressive. Two of them are fleet leasing companies, two are tourism-related (travel agencies), and one is a cyber-security firm. The cyber-security stock probably got a boost just from being the only one in its niche, it’s easy to imagine it being caught up in global tech hype.

As for the tourism and fleet leasing firms, some might explain their performance by pointing to high inflation and interest rates. Under those conditions, companies often prefer to outsource travel and transport instead of taking on those rising costs themselves.

Personally, I think that explanation might be part of the story, but there’s more to it. These companies are small caps (the largest has a market cap of around $300 million), which makes them much more prone to speculative trading. A bit of hype, a headline, or a short-term spike in demand can cause big price swings. Their movements are often sharp and volatile.

We could easily test this idea by looking at their daily price data and volatility, but that’s outside the scope of this analysis, so I’ll leave it there for now.

d) Introducing Company Size

From this point forward, it makes sense to explore the data with one more layer of context: company size. Specifically, I grouped companies based on their market capitalizations to see whether investors might have shown a preference for larger or smaller firms during this period.

annual_data <- annual_data %>%
  filter(!is.na(MarketCap))        # [1]

log10_market_cap <- log10(annual_data[annual_data$MarketCap > 0,]$MarketCap)    # [2] [3]

mu <- mean(log10_market_cap)
sigma <- sd(log10_market_cap) 

lower_th <- 10^(mu - sigma)    # [4]
upper_th <- 10^(mu + sigma)

#New variable: Company size
annual_data <- annual_data %>%
  mutate(size = case_when(
    MarketCap < lower_th ~ "Small",
    between(MarketCap, lower_th, upper_th) ~ "Medium",
    MarketCap > upper_th ~ "Large")) 

annual_data_no_marketcap <- annual_data_no_marketcap %>%     # [5]
  mutate(size = "Small")

annual_data <- rbind(annual_data, annual_data_no_marketcap)  # [6]

[1] Filtering out companies with missing market cap data
[2] Log transformation helps scale down large market cap values
[3] A single company had a negative market cap — likely a data entry error. It’s ignored here.
[4] I split companies using ±1 standard deviation from the mean of log-transformed market cap: 17% small, 68% medium, 17% large.
[5] Companies with missing market cap data were likely excluded from Yahoo Finance’s financials due to their small size — so it’s reasonable to assume they are small-cap and label them accordingly.
[6] Re-integrating these companies back into the main dataset to include them in the size-based analysis.

e) Price Change by Company Size

1-Range

We can quickly check the PriceChange range for different size of companies.

range <- annual_data %>% 
  group_by(size) %>%
  summarize(lowest = min(PriceChange), highest = max(PriceChange)) %>%
  mutate(size_numeric = as.numeric(factor(size)))

range %>% 
  ggplot(aes(size_numeric, col = size)) +
  geom_linerange(aes(ymin = lowest, ymax = highest), linewidth = 1.5) +
  geom_segment(aes(x = size_numeric - 0.2, xend = size_numeric + 0.2, 
                   y = lowest, yend = lowest), 
               linewidth = 1.5,
               show.legend = FALSE) +
  geom_segment(aes(x = size_numeric - 0.2, xend = size_numeric + 0.2, 
                   y = highest, yend = highest), 
               linewidth = 1.5,
               show.legend = FALSE) +
  scale_y_continuous(breaks = seq(-200, 600, 50)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_color_discrete(name = "Size",
                       breaks = c("Small", "Medium", "Large"),
                       labels = c("Small", "Medium", "Large")) +
  labs(title = "Price Change (Range) by Size", x = "", y = "Price Change (%)") +
  theme_minimal(base_size = 12) +
  theme(title = element_text(face = "bold"),
        axis.ticks.y = element_blank(), 
        axis.text.y = element_blank(),
        legend.title = element_text(face = "bold", size = 14),
        legend.key = element_blank(),
        legend.text = element_text(size = 11),
        legend.key.size = unit(2, "lines")) +
  coord_flip()

It’s not surprising to see this pattern. Small-cap companies tend to be more volatile, they offer the potential for higher returns, but also come with the risk of steeper losses. On the other hand, large-cap companies are usually more stable and less reactive to short-term noise, but that stability often comes with more modest returns.

That said, this plot only shows the range of price changes, not the distribution. To get a clearer picture of how these changes are spread out, we need to look at a density plot next.

2-Density Check

sign_log <- function(val){           
  sign(val) * log1p(abs(val))       # [1] 
}

data_scaled <- annual_data %>%
  mutate(PriceChange_sc = sign_log(PriceChange))   # [2]

[1] Handles both negative and positive values using log1p for numerical stability
[2] Scaling PriceChange

ggplot(data_scaled, aes(x = PriceChange_sc, fill = size)) +
  geom_density(aes(col = size), alpha = 0.3) +
  scale_fill_discrete(name = "Size",
                      breaks = c("Small", "Medium", "Large"),
                      labels = c("Small", "Medium", "Large")) +
  scale_color_discrete(guide = "none") + 
  labs(title = "Density of Price Change by Company Size",
       x = "Price Change (Custom Log Scale)", y = "Density") +
  theme_minimal(base_size = 13) +
  theme(title = element_text(face = "bold", size = 15),
        legend.text = element_text(size = 15),
        legend.title = element_text(size = 16)) +
  scale_x_continuous(limit = c(-8, 8)) +
  geom_vline(xintercept = 0, linetype = "dashed", col = "red")

As expected, just looking at the range doesn’t tell the full story. The density plot adds more nuance: small- and mid-cap companies have more density on the negative side, which means a larger portion of them lost value. Large-cap companies, on the other hand, show more density on the positive side — suggesting more of them posted gains.

But this doesn’t necessarily mean that large caps performed better on average. It just tells us that a bigger share of them had positive returns. If we want to understand performance more accurately, we’ll need to compare medians and run some tests, and that’s exactly what we’ll do next.

3-Median Comparison

small <- annual_data[annual_data$size == "Small", ]$PriceChange
medium <- annual_data[annual_data$size == "Medium", ]$PriceChange
large <- annual_data[annual_data$size == "Large", ]$PriceChange

#Boxplot stats
outlier_data <- annual_data %>%
  group_by(size) %>%
  summarise(median = median(PriceChange),
            Q1 = quantile(PriceChange, 0.25),
            Q3 = quantile(PriceChange, 0.75),
            IQR = Q3 - Q1,
            upper_end = Q3 + 1.5 * IQR,
            lower_end = Q1 - 1.5 * IQR,
            upper_whisker = max(PriceChange[PriceChange <= upper_end]),
            lower_whisker = min(PriceChange[PriceChange >= lower_end]), 
            upper_outlier_count = sum(PriceChange > upper_end),
            lower_outlier_count = sum(PriceChange < lower_end)) %>%
  mutate(label_upper = paste0("Outlier Count = ", upper_outlier_count),
         label_lower = paste0("Outlier Count = ", lower_outlier_count))

annual_data %>%
  ggplot(aes(x = factor(size, levels = c("Large", "Medium", "Small")),
             y = PriceChange,
             fill = size)) +
  geom_boxplot(outlier.shape = NA, alpha = 0.7) +           # [1]
  geom_hline(yintercept = 0, col = "red", linewidth = 0.5, linetype = "dashed") +
  labs(title = "Price Change by Company Size",
       x = "Company Size",
       y = "Mean Price Change (%)") +
  theme_minimal(base_size = 13) +
  theme(legend.position = "none",
        plot.title = element_text(face = "bold")) +
  scale_y_continuous(breaks = seq(-100, 125, 25)) +
  coord_cartesian(ylim = c(-100, 125)) +
  geom_text(data = outlier_data,                   # [2]
            aes(x = size, 
                y = median, 
                label = round(median, 2), 
                col = size),
            hjust = c(-2.45, -2.0, -2.5), 
            vjust = c(0.7, 0.7, 0.15), 
            size = 4.5, 
            fontface = "bold") +
  geom_text(data = outlier_data %>% filter(upper_outlier_count > 0),        # [3]
            aes(x = factor(size, levels = c("Large", "Medium", "Small")),
                y = upper_whisker,
                label = label_upper),
            vjust = -1,
            fontface = "italic",
            size = 4) +
  geom_text(data = outlier_data %>% filter(lower_outlier_count > 0),        
            aes(x = factor(size, levels = c("Large", "Medium", "Small")),
                y = lower_whisker,
                label = label_lower),
            vjust = 2,
            fontface = "italic",
            size = 4)

[1] Outliers are not visually represented in this plot for better visuality.
[2] Medians
[3] The number of outliers are shown at the top of the whiskers.

The boxplot suggests that large cap companies performed better than both small and medium caps. Let’s quickly test that to be sure.

4-Bayesian Bootstrap Test for the Difference in Medians

We can perform a comparison test to see whether the PriceChange difference between company sizes is statistically meaningful. A standard t-test might come to mind, but it comes with some heavy assumptions, like equal variances and normality, which clearly don’t hold here. Our data has unequal variances and extreme outliers (respectively: var(X), plot(density(X))), and using the mean as a summary measure would be misleading.

That’s why a non parametric, median based approach makes more sense. A Wilcoxon rank sum test could work, but I went for something more flexible: a Bayesian bootstrap, which gives the full posterior distributions of median differences. It’s not just a stylistic choice, this is a Bayesian statistics project after all, but it’s practically useful. The Bayesian bootstrap gives smoother samples, credible intervals, and probability-based statements about median differences.

Since the standard error of the median has no closed-form solution, we have to use bootstrapping anyway. And by setting α = 0.5, we can give a bit more weight to potential outliers.

While this isn’t a “test” in the frequentist sense, we can assess whether the 95% credible interval for the median difference includes zero. If it does, then the data does not provide strong evidence for a meaningful difference.

weighted_med <- function(sample, weights) {      # [1]
  ordered_s <- order(sample)    
  sorted_sample <- sample[ordered_s]
  sorted_weight <- weights[ordered_s]    
  cumsum_w <- cumsum(sorted_weight)
  sorted_sample[which(cumsum_w >= 0.5)[1]]
}

# Bayesian Bootstrapping function
bayesian_bootstrap <- function(X, Y, alpha, iter = 10000) {
  
  n_X <- length(X)
  n_Y <- length(Y)
  diffs <- vector("numeric", iter)
  
  for (i in 1:iter) {
    
    weight_X <- LaplacesDemon::rdirichlet(1, rep(alpha, n_X))        # [2]
    weight_Y <- LaplacesDemon::rdirichlet(1, rep(alpha, n_Y))
    
    med_x <- weighted_med(X, weight_X)  
    med_y <- weighted_med(Y, weight_Y)
    
    diffs[i] <- med_x - med_y       # [3]
  }
  
  return(diffs)
}

s_m_diffs <- bayesian_bootstrap(small, medium, alpha = 0.5)    
s_l_diffs <- bayesian_bootstrap(small, large, alpha = 0.5)
m_l_diffs <- bayesian_bootstrap(medium, large, alpha = 0.5)

[1] Calculates the weighted median using cumulative weights. Different from a weighted mean: it returns the value where the cumulative sum of weights crosses 0.5
[2] Dirichlet Weights
[3] Median difference

samples <- list(list(name = "s_m_diffs", children = "Small - Medium"),
                list(name = "s_l_diffs", children = "Small - Large"),   
                list(name = "m_l_diffs", children = "Medium - Large"))


par(mfrow = c(2, 2))
 
for(i in 1:3){
  hist(get(samples[[i]]$name), breaks = 40, col = "skyblue", border = "white",
       main = paste0("Median Difference (", samples[[i]]$children, ")"),
       xlab = "Difference in Weighted Medians")
  
  abline(v = quantile(get(samples[[i]]$name), c(0.025, 0.975)), 
         col = "darkgreen", lty = 2)
  
  abline(v = 0, col = "red", lwd = 2)
}

par(mfrow = c(1, 1))

As it turns out, the 95% credible intervals for all comparisons include zero, suggesting that there isn’t strong evidence for a meaningful difference in median price changes across company sizes. While large-cap stocks initially appeared more resilient, this result implies that their apparent advantage may not be credibly different from the others. In short, company size alone likely didn’t drive performance during this period.

Next, we’ll take a closer look at the distribution of returns by company size, specifically, how many companies in each group fell into various performance brackets: gains of 0–30%, 30–60%, and over 60%, and losses in the same ranges. This should give us a more nuanced view of the risk/reward profile for different market cap stocks.

5- Risk/Reward

Imagine you’re picking a company at random, with only one condition: you get to choose the size of the company (small, medium, or large). You might end up selecting a little known small-cap stock and walk away with an impressive 218% return in just one year. Or maybe you go with a well-established large-cap firm and make a modest 32% gain, which, considering the inflation rate during this period, might actually amount to a real loss.

So the question becomes: was your outcome the result of a smart decision based on company size, or were you simply lucky? And more importantly, does the potential return justify the risk you’re taking, purely based on company size?

#Setting up the data
annual_risk_return <- annual_data %>%
  mutate(low_profit = ifelse(between(PriceChange, 0, 30), 1, 0),      # [1]
         mid_profit = ifelse(between(PriceChange, 30, 60), 1, 0),
         high_profit = ifelse(PriceChange > 60, 1, 0),
         low_loss = ifelse(between(PriceChange, -30, 0), 1, 0),
         mid_loss = ifelse(between(PriceChange, -60, -30), 1, 0),
         high_loss = ifelse(PriceChange < -60, 1, 0))


proportion_df <- annual_risk_return %>%
  group_by(size) %>%
  summarise(n = n(), 
            low_profit = sum(low_profit),
            mid_profit = sum(mid_profit),
            high_profit = sum(high_profit),
            low_loss = sum(low_loss),
            mid_loss = sum(mid_loss),
            high_loss = sum(high_loss)) %>%
  mutate(prop_low_profit = low_profit / n,       # [2]
         prop_mid_profit = mid_profit / n,
         prop_high_profit = high_profit / n,
         prop_low_loss = low_loss / n,
         prop_mid_loss = mid_loss / n,
         prop_high_loss = high_loss / n) %>%
  select(size, prop_low_profit, prop_mid_profit, prop_high_profit, prop_low_loss, prop_mid_loss, prop_high_loss)

[1] Profit and loss categories were defined symmetrically around zero:
Low: between 0% and ±30%
Medium: between ±30% and ±60%
High: greater than ±60% (either gain or loss)
[2] Proportions

grid.arrange(a, b, ncol = 1)

Note: I hid the plotting code because it was unnecessarily long.

The bar plots reveal a clear message: during this period, choosing stocks based solely on company size does not offer a compelling risk/reward trade-off. Smaller-cap stocks do provide a higher upside, you are more likely to hit a big winner with returns above 60%. However, they also carry a much greater risk. The likelihood of experiencing a moderate to severe loss (0–60%) is significantly higher among small caps.

Large-cap stocks, on the other hand, do not necessarily provide safer or more predictable returns. While their downside risk is somewhat lower, they still show substantial exposure to loss, and their upside is limited. This suggests that larger size does not guarantee stability or protection in such an unstable market environment.

So, if you’re picking a stock at random within a size category, the expected return does not appear to justify the risk, particularly when inflation is considered. This doesn’t mean company size is irrelevant, but it’s clearly not sufficient on its own to explain or guide investment outcomes.

For a more robust and statistically grounded assessment of risk and reward, one would need to account for factors such as volatility, sectoral dynamics, beta, and perhaps macroeconomic indicators. That said, these are outside the scope of the current analysis.

Statistical Analysis and Bayesian Data Modeling

I’ve already talked about how financial statements can be distilled into ratios that reflect a company’s profitability, efficiency, leverage, liquidity, and more. Some of these, like ROA, ROE, P/E, D/E, EPS, etc., have been researched extensively, used in textbooks, and cited for decades.

What I wanted to explore here is more personal: if I had applied the same strategy I’ve been casually using for the past five years, right after the release of 2023-Q4 financial statements (between February and April 2024), and held those stocks for about a year, would the ratios I rely on have actually helped explain the price changes? And if so, to what extent?

To answer that, I used a Bayesian hierarchical linear regression model, a method that allows me to pool information across different sectors while still accounting for sector-specific effects.

In this final section;

1. I’ll walk through the specific ratios I’ve personally been using over the past five years, which is roughly half the time I’ve been actively investing.

2. I’ll explore whether there’s any linear relationship between these ratios and stock price changes.

3. Then, I’ll move into a Bayesian modeling framework to test whether any of these ratios meaningfully explain price movements.

1. Financial Ratios

As I mentioned earlier, the goal here is to see whether the financial ratios I personally rely on have actually had any relationship with market price changes — at least during this specific period. Just to be perfectly clear: these ratios weren’t chosen based on any deep theoretical research. I picked them heuristically — meaning they simply made intuitive sense to me at the time. While the effectiveness of mainstream financial ratios has been tested over and over again, the ones I’ve been using haven’t really been examined — and that’s kind of the whole point.

IncomeChange: Percent change in net income compared to the previous period.
– This one’s really simple. If a company manages to increase its profits compared to the previous period, that probably means things are going well — and maybe they’ll keep going well. But more importantly, I assume other people in the market will think the same. So, even if things aren’t objectively improving, what really matters is whether others believe they are. If they do, they’ll buy — and that could push the price up. In that sense, it’s as much about perception as it is about fundamentals.

IncomeRatio: Net income divided by the company’s market cap.
– This is basically a rough profitability measure. It shows how much the company earned relative to it’s market cap. A high ratio could mean the company is undervalued — or that it’s risky and the market isn’t buying into its earnings. It’s not something I interpret in isolation, but it’s often a useful flag. I don’t have a benchmark. It varies too much by sector.

EquityChange: Percent change in total equity from the previous period.
– An increase in equity can mean retained profits, capital injections, or other balance sheet improvements. A decline might reflect losses, dividend payouts, or stock buybacks. I use this as a rough check on whether the company’s long-term position is getting stronger or weaker.

Payability: Percent change in working capital compared to the previous period.
– It’s not a textbook ratio. I have used this to track whether the company is improving or worsening its short-term financial position. Positive values suggest improved liquidity and negative values suggest tightening.

STermRatio: Current liabilities divided by total liabilities.
– This tells me how much of the company’s obligations are short-term. A higher number could mean more financial pressure in the near term — unless the company also has strong liquid assets. I usually read this alongside working capital.

Investing: Negative of investing cash flow divided by total equity.
– I flipped the sign because investing cash flows are typically negative (due to spending), and I wanted higher values to represent more aggressive reinvestment. This ratio gives me a sense of how much of a company’s equity is being funneled back into growth.

ProfitMargin: Net income divided by total revenue.
– It tells me what percent of the revenue actually ends up as profit. Higher margins often mean the company is efficient. It is a mainstream ratio.

DebtEquity: Total liabilities divided by total equity.
– A high value means the company is heavily reliant on debt to finance its operations.

Inflation: Percent change in the ratio of cost of revenue to total revenue compared to the previous year.
– This one’s a bit different from the rest. I built this custom metric as a proxy for how inflation is hitting companies at the operational level — kind of like a company-specific input cost inflation. It’s calculated by comparing the change in the cost-to-revenue ratio (CostOfRevenue / TotalRevenue) over two periods. If the ratio jumps, it might mean the company is facing production costs without being able to raise its prices proportionally. It’s not perfect, and probably not even close, but I wanted to see if it adds any useful signal — especially in periods where inflation is high but not always visible in other metrics.

2. Calculating the Variables and Setting Up the Data

Let’s calculate the variables using the financial line items we pulled from the YahooFinance website. I’m providing the full code from the beginning to keep everything easy to follow.

First, we’ll define a couple of helper functions:

scale_col <- function(col) {
  (col - mean(col, na.rm = TRUE)) / sd(col, na.rm = TRUE)     # [1]
}


mad_trim <- function(x, th = 10) {         # [2]
  
  med <- median(x, na.rm = TRUE)
  MAD <- mad(x, constant = 1, na.rm = TRUE)
  
  upper <- med + th * MAD
  lower <- med - th * MAD
  
  x[x > upper | x < lower] <- NA
  
  return(x)
}

[1] A custom standardization function that safely handles NAs.
[2] Some variables have extreme values that distort scaling. This function trims those using MAD (Median Absolute Deviation), which is more robust than standard deviation.

As we did in previous sections, we now load, merge, and prep our base data:

df_annual <- read_csv("/Financial-Ratio-Analysis/Data/df_annual.csv")
prices_annual <- read_csv("/Financial-Ratio-Analysis/Data/prices_annual.csv")

annual_data_raw <- merge(prices_annual, df_annual, by.x = "stock") %>%
    mutate(across(-c(1:4), ~ ./1e6))
  
annual_data_unscaled <- annual_data_raw %>% 
    mutate(PriceChange = (after - before) / before,
           MarketCap = `2024ShareIssued` * before) %>%
    filter(!is.na(PriceChange))

Now we define our predictors (financial ratios):

annual_data_unscaled <- annual_data_unscaled %>% 
    mutate(CorRatio2024 = `2024CostOfRevenue` / `2024TotalRevenue`,
           CorRatio2023 = `2023CostOfRevenue` / `2023TotalRevenue`,
           Inflation = (CorRatio2024 - CorRatio2023) / CorRatio2023,         
           IncomeChange = (`2023NetIncome` - `2022NetIncome`) / abs(`2022NetIncome`), 
           IncomeRatio = `2023NetIncome` / MarketCap,
           EquityChange = (`2023TotalEquityGrossMinorityInterest` - `2022TotalEquityGrossMinorityInterest`) / abs(`2022TotalEquityGrossMinorityInterest`),  
           Payability = (`2023WorkingCapital` - `2022WorkingCapital`) / abs(`2022WorkingCapital`),  
           STermRatio = `2023CurrentLiabilities` / `2023TotalLiabilitiesNetMinorityInterest`,
           Investing = -(`2023InvestingCashFlow` / `2023TotalEquityGrossMinorityInterest`),  
           ProfitMargin = `2023NetIncome` / `2023TotalRevenue`,
           DebtEquity = `2023TotalLiabilitiesNetMinorityInterest` / `2023TotalEquityGrossMinorityInterest`
           ) %>%
    select(c(stock, sector, PriceChange, (length(.) - 8):length(.)))

Final cleanup and scaling:

annual_data_unscaled <- annual_data_unscaled %>%    # [1]
  mutate(across(4:length(.), mad_trim))
  
annual_data_scaled <- annual_data_unscaled %>%      # [2]
  mutate(across(3:length(.), scale_col)) 
  
  
annual_data <- annual_data_scaled %>%             # [3]
  filter(if_all(3:length(.), ~ is.na(.) | (. > -3 & . < 3)))

[1] Removing extreme values that could distort standardization.
[2] Standardizing both predictors and the response variable.
[3] Reducing working space to |z| < 3

Let’s take a quick look at the final dataset:

head(as_tibble(annual_data), 5)

## # A tibble: 5 × 12
##   stock sector       PriceChange Inflation IncomeChange IncomeRatio EquityChange
##   <chr> <chr>              <dbl>     <dbl>        <dbl>       <dbl>        <dbl>
## 1 A1CAP BROKERAGE H…      -0.426    -0.134       -0.770      -0.879        1.34 
## 2 ACSEL CHEMICALS, …      -0.509     0.149       -0.350      -0.331       -0.166
## 3 ADEL  OTHER MANUF…      -0.612    -0.724       NA          -0.211        1.06 
## 4 ADESE REAL ESTATE…      -0.524     0.588        0.331      NA            0.179
## 5 ADGYO REAL ESTATE…      -0.290     0.428       -0.460      -0.167        1.84 
## # ℹ 5 more variables: Payability <dbl>, STermRatio <dbl>, Investing <dbl>,
## #   ProfitMargin <dbl>, DebtEquity <dbl>

3. Looking for a Linear Relationship

We’re going with a model that assumes linearity, so it makes sense to first check whether there’s any apparent relationship between the response variable and the predictors.

n_ratios <- annual_data %>%
  select(-c(stock, sector, PriceChange)) %>%
  length(.)


par(mfrow = c(3, 3))

for(i in 1:n_ratios) {
  
  second_col <- colnames(annual_data)[i+3]
  data <- annual_data[, c("PriceChange", second_col)] %>%
    na.omit()
  
  plot(data$PriceChange, data[,2],
       xlab = "PriceChange", ylab = second_col,
       main = paste("PriceChange vs", second_col),
       col = "steelblue4")
}

par(mfrow = c(1, 1))

From these plots, it looks like there’s no meaningful relationship — linear or nonlinear — between any of the predictors and the response variable.

Let’s confirm that by checking the correlation matrix:

annual_data_no_na <- annual_data %>%
  na.omit()

M <- cor(annual_data_no_na[,3:12])

corrplot(M, method = "number")

As expected, the correlations are weak across the board. Nothing suggests a strong linear connection between any of the ratios and the price change (the first row/column). At the same time, we also get a sense of collinearity between predictors, and it appears there’s no serious multicollinearity either.

At this point, it probably wouldn’t make much sense to go with a standard linear model. Still, I’m going to fit the hierarchical model anyway, just to see if there’s any structure hidden underneath, especially at the sector level.

4. Bayesian Hierarchical Linear Regression

In this section, I’m going to present two models. The first one is a standard Bayesian hierarchical regression, where each predictor has its own coefficient for each sector. In other words, every sector gets a different estimate for how much each financial ratio affects price change.

As the results will show, the second model was ultimately the one I chose. But before jumping ahead, it’s important to note that the goal here is not prediction. Short-term stock price prediction is extremely difficult — if not impossible — due to the number of unpredictable and random factors involved. These include unseen macroeconomic dynamics, the fact that stock prices don’t always reflect a company’s current financial health, and the reality that markets are largely driven by demand, not just fundamentals. On top of that, global and national politics, media narratives, and even public sentiment about a company can all influence its price.

Because of this complexity, and since no strong linear relationships were found, I didn’t include any residual analysis. Instead, I focus on the model setup, diagnostics, and posterior predictive checks (PPC). At the end, I summarize the key takeaways.

Preparing the data for JAGS:

annual_data <- annual_data %>% 
  select(sector, PriceChange, Inflation, 
         IncomeChange, IncomeRatio, EquityChange, 
         Payability, STermRatio, Investing, 
         ProfitMargin, DebtEquity)

annual_data <- annual_data %>%
  na.omit()

Some sectors don’t have enough companies to support reliable sector-level coefficients.

low_company_sectors <- annual_data %>% 
  group_by(sector) %>%
  summarise(count = n()) %>%
  arrange(-desc(count)) %>%
  filter(count %in% 1:4) %>%
  pull(sector)

low_company_sectors

## [1] "ASSET MANAGEMENT COMPANIES"                       
## [2] "FINANCIAL LEASING AND FACTORING COMPANIES"        
## [3] "INFORMATION AND COMMUNICATION"                    
## [4] "PROFESSIONAL, SCIENTIFIC AND TECHNICAL ACTIVITIES"
## [5] "AGRICULTURE, FORESTRY AND FISHING"                
## [6] "WOOD PRODUCTS INCLUDING FURNITURE"                
## [7] "ADMINISTRATIVE AND SUPPORT SERVICE ACTIVITIES"    
## [8] "CONSTRUCTION AND PUBLIC WORKS"

So, we group them under the label “OTHER”:

annual_data <- annual_data %>%
  mutate(sector = case_when(
    sector %in% low_company_sectors ~ "OTHER",
    TRUE ~ as.character(sector)
    ))

We assign a numeric ID to each sector. The sectors table will help us later interpret which ID corresponds to which sector:

sectors <- annual_data %>%
  mutate(sector_id = as.numeric(factor(sector))) %>%
  select(sector, sector_id) %>%
  unique(.) %>%
  arrange(-desc(sector_id))

Final dataset to be used:

data <- merge(sectors, annual_data, by.x = "sector") %>%
  select(-sector)

head(as_tibble(data), 5)

## # A tibble: 5 × 11
##   sector_id PriceChange Inflation IncomeChange IncomeRatio EquityChange
##       <dbl>       <dbl>     <dbl>        <dbl>       <dbl>        <dbl>
## 1         1      -0.545    0.370        -1.33       -0.888       -1.15 
## 2         1       0.511    0.371         0.583      -0.242        0.506
## 3         1       1.75    -0.0603       -0.608      -0.518       -0.203
## 4         1      -0.105   -0.0771       -0.574      -0.377       -0.124
## 5         1      -0.367    0.237        -2.27       -1.35        -0.940
## # ℹ 5 more variables: Payability <dbl>, STermRatio <dbl>, Investing <dbl>,
## #   ProfitMargin <dbl>, DebtEquity <dbl>

Initial Model

The first model allows every sector to have its own coefficient for every predictor. The idea is that if a coefficient is large (in absolute value), that predictor has a stronger effect on price change for that sector.

Likelihood:

\[ y_i \sim \mathcal{N}(\mu_i, \sigma^2), \quad \mu_i = \alpha + \sum_{k=1}^K \beta_{k, s_i} x_{ik} \]

Priors:

\[ \beta_{k, s_i} \sim \mathcal{N}(\mu_{\beta_k}, \tau_{\beta_k}^2), \quad \alpha \sim t_3(0, 1), \quad \sigma \sim t_3^+(0, 1) \]

Hyperpriors:

\[ \mu_{\beta_k} \sim t_3(0, 1), \quad \tau_{\beta_k} \sim t_3^+(0, 1) \]

$y$: The response (PriceChange)
$i$: Observations
$s_i$: The sector ID for observation $i$
$k$: Predictors.

I started with a simple Gaussian likelihood to see how well the model explains price changes across different sectors.

For the intercept, I used a weakly informative Student-t prior with 3 degrees of freedom. It gives the model enough room to move the overall average up or down, especially if some sectors are very different from others. It’s not overly constraining but still reins things in a bit to avoid overfitting.

For the standard deviation, I used a half Student-t prior, basically a t-distribution that only allows positive values. Since standard deviation can’t be negative, this makes sense. Again, I kept it weakly informative so the model is free to learn whatever noise level the data supports, whether small or large.

Each sector-specific coefficient (for every predictor) has a normal prior. This reflects the idea that sectors might respond differently to a financial ratio, but those effects are still drawn from the same overall distribution — unless the data says otherwise.

The global means for those coefficients have Student-t priors, which are more robust to outliers than normal distributions. If a predictor has no clear relationship with price change, the model will shrink it toward zero. But if there is a strong effect in a specific sector, the heavier tails of the t-distribution will allow for it.

Finally, the variation across sectors (how much coefficient values differ between sectors) is modeled using half Student-t priors for standard deviations. These allow the model to adapt. If there’s little between-sector variation, the model learns that, if there’s a lot, it adjusts accordingly.

Model Setup in JAGS

mod_string <- "model{

  for(i in 1:length(PriceChange)){
    PriceChange[i] ~ dnorm(mu[i], prec)
    mu[i] <- intercept + 
      B_IncomeChange[sector_id[i]] * IncomeChange[i] +      
      B_IncomeRatio[sector_id[i]] * IncomeRatio[i] + 
      B_EquityChange[sector_id[i]] * EquityChange[i] + 
      B_Inflation[sector_id[i]] * Inflation[i] + 
      B_Payability[sector_id[i]] * Payability[i] + 
      B_STermRatio[sector_id[i]] * STermRatio[i] + 
      B_Investing[sector_id[i]] * Investing[i] + 
      B_ProfitMargin[sector_id[i]] * ProfitMargin[i] + 
      B_DebtEquity[sector_id[i]] * DebtEquity[i]            
  }
  
  
  #Prior on intercept
  intercept ~ dt(0, 1, 3)

  #Prior on SD of PriceChange
  sigma ~ dt(0, 1, 3) T(0,)     #Truncating SD for larger than 0
  prec <- 1 / pow(sigma, 2)
  
  
  #Priors on coefs of sector-specific ratios
  for(j in 1:n_sector){
    B_IncomeChange[j] ~ dnorm(mu_B_IncomeChange, prec_B_IncomeChange)
    B_IncomeRatio[j] ~ dnorm(mu_B_IncomeRatio, prec_B_IncomeRatio)
    B_EquityChange[j] ~ dnorm(mu_B_EquityChange, prec_B_EquityChange)
    B_Inflation[j] ~ dnorm(mu_B_Inflation, prec_B_Inflation)
    B_Payability[j] ~ dnorm(mu_B_Payability, prec_B_Payability)
    B_STermRatio[j] ~ dnorm(mu_B_STermRatio, prec_B_STermRatio)
    B_Investing[j] ~ dnorm(mu_B_Investing, prec_B_Investing)
    B_ProfitMargin[j] ~ dnorm(mu_B_ProfitMargin, prec_B_ProfitMargin)
    B_DebtEquity[j] ~ dnorm(mu_B_DebtEquity, prec_B_DebtEquity)
  }
  
  #Hyperpriors on means of sector-specific ratio coefs
  mu_B_IncomeChange ~ dt(0, 1, 3)
  mu_B_IncomeRatio ~ dt(0, 1, 3)
  mu_B_EquityChange ~ dt(0, 1, 3)
  mu_B_Inflation ~ dt(0, 1, 3)
  mu_B_Payability ~ dt(0, 1, 3)
  mu_B_STermRatio ~ dt(0, 1, 3)
  mu_B_Investing  ~ dt(0, 1, 3)
  mu_B_ProfitMargin ~ dt(0, 1, 3)
  mu_B_DebtEquity ~ dt(0, 1, 3)
  
  
  #Hyper-priors on standard deviations of sector-specific ratio coefs
  sigma_B_IncomeChange ~ dt(0, 1, 3) T(0,)         
  prec_B_IncomeChange <- 1 / pow(sigma_B_IncomeChange, 2)
  
  sigma_B_IncomeRatio ~ dt(0, 1, 3) T(0,)         
  prec_B_IncomeRatio <- 1 / pow(sigma_B_IncomeRatio, 2)
  
  sigma_B_EquityChange ~ dt(0, 1, 3) T(0,)         
  prec_B_EquityChange <- 1 / pow(sigma_B_EquityChange, 2)
  
  sigma_B_Inflation ~ dt(0, 1, 3) T(0,)         
  prec_B_Inflation <- 1 / pow(sigma_B_Inflation, 2)

  sigma_B_Payability ~ dt(0, 1, 3) T(0,)         
  prec_B_Payability <- 1 / pow(sigma_B_Payability, 2)
  
  sigma_B_STermRatio ~ dt(0, 1, 3) T(0,)         
  prec_B_STermRatio <- 1 / pow(sigma_B_STermRatio, 2)
  
  sigma_B_Investing ~ dt(0, 1, 3) T(0,)         
  prec_B_Investing <- 1 / pow(sigma_B_Investing, 2)
  
  sigma_B_ProfitMargin ~ dt(0, 1, 3) T(0,)         
  prec_B_ProfitMargin <- 1 / pow(sigma_B_ProfitMargin, 2)
  
  sigma_B_DebtEquity ~ dt(0, 1, 3) T(0,)         
  prec_B_DebtEquity <- 1 / pow(sigma_B_DebtEquity, 2)
}"


params <- c("intercept", 
            "B_IncomeChange", "B_IncomeRatio", "B_EquityChange",            
            "B_Inflation", "B_Payability", "B_STermRatio", 
            "B_Investing", "B_ProfitMargin", "B_DebtEquity",
            "mu_B_IncomeChange", "mu_B_IncomeRatio", "mu_B_EquityChange",
            "mu_B_Inflation", "mu_B_Payability", "mu_B_STermRatio", 
            "mu_B_Investing", "mu_B_ProfitMargin", "mu_B_DebtEquity", 
            "sigma_B_IncomeChange", "sigma_B_IncomeRatio", "sigma_B_EquityChange",
            "sigma_B_Inflation", "sigma_B_Payability", "sigma_B_STermRatio", 
            "sigma_B_Investing", "sigma_B_ProfitMargin", "sigma_B_DebtEquity",
            "sigma")


data_jags <- c(as.list(data), list(n_sector = length(unique(data$sector_id))))

model <- jags.model(textConnection(mod_string), data_jags, n.chains = 3)    # 3 chains to check convergence

Running the model. It might take some time.

update(model, 10000)                  # [1]

mod_sim <- coda.samples(model, params, n.iter = 50000, thin = 5)     # [2]    
mod_csim <- as.mcmc(do.call(rbind, mod_sim))

[1] Burn-in
[2] A small lag is applied to avoid autocorrelation.

Note: My posterior samples are in /Financial-Ratio-Analysis/Samples

mod_sim <- readRDS("/Financial-Ratio-Analysis/Samples/mod_sim.rds")       
mod_csim <- as.mcmc(do.call(rbind, mod_sim))

Convergence Check

The following code generates convergence and density plots for each parameter in the posterior chains:

plot(mod_sim)

However, plotting all samples for every parameter can be computationally heavy on the computer (at least on mine), since there are many sector-specific coefficients and three chains. Altogether, this results in tens of thousands of posterior samples.

Instead, you can inspect only the most recent part of the chains. Here’s how to extract and plot the last 1,000 samples (per chain):

mod_sim_last <- window(mod_sim, start = end(mod_sim) - 1000 * 5)  
plot(mod_sim_last)

Below, you can see the traceplots for just three randomly chosen parameters to keep the presentation concise (Last 5000 posterior samples per chain). This is just to give a general feel for how well the chains mix:

From the traceplots, the chains seem to have mixed well. There is no signs of divergence or stuck values.

To back this up, we can use the Gelman-Rubin diagnostic:

gelman <- gelman.diag(mod_sim)

gelman_df <- data.frame(param = rownames(gelman$psrf),
                        PointEstimate = gelman$psrf[, 1],
                        UpperCI = gelman$psrf[, 2],
                        Converged = gelman$psrf[, 2] < 1.05)

sum(gelman_df$Converged == "FALSE")

## [1] 0

All chains passed the Gelman-Rubin diagnostic. The upper confidence limits are below 1.05, which suggests convergence.

Next, we’ll look at effective sample sizes, which tell us how many independent samples the chains actually give us, after accounting for autocorrelation.

Effective Sample Size

To make reliable inferences from the posterior samples, an effective sample size (ESS) of at least 1,000 is generally recommended. Since I’m planning to report credible intervals, I’ve aimed for an ESS above 2,000 whenever possible.

efsize <- effectiveSize(mod_csim)

efsize[which(efsize < 2000)]

##    mu_B_EquityChange   sigma_B_DebtEquity sigma_B_EquityChange 
##               1699.9               1898.0               1822.9 
## sigma_B_IncomeChange 
##               1910.4

All parameters show an ESS above 1,500, which is reassuring. Only four parameters fall below the 2,000 mark. Three of those are global standard deviations, which isn’t a big concern since they’re higher-order hyperparameters and usually don’t require as much precision.

However, the global mean coefficient for EquityChange has an ESS of around 1,700. This is still quite reasonable, but it’s worth keeping in mind when interpreting results.

Inference

We’ve already checked that the model has converged and that the effective sample sizes are sufficient. Now, it’s time to turn to the summary statistics to answer the big question: Do the financial ratios I used actually explain changes in stock price?

To evaluate this, we look at the Highest Posterior Density (HPD) intervals for the global mean coefficients (mu_B_). These represent the average effect of each ratio across all sectors. If an HPD interval excludes 0, it suggests there’s credible evidence for a real effect. If the interval includes 0, the data doesn’t support a directional relationship.

hpd <- HPDinterval(mod_csim)
hpd <- as.data.frame(hpd) %>%
  rownames_to_column(var = "parameter") %>%
  rowwise() %>%
  mutate(significance = !between(0, lower, upper))

hpd[str_detect(hpd$parameter, "^mu_B") & hpd$significance == 1,] %>%
  as.data.frame()

##          parameter   lower   upper significance
## 1 mu_B_IncomeRatio 0.01307 0.40532         TRUE

From this output, we see that only the global mean of IncomeRatio has an HPD interval that excludes zero. So, on average across sectors, only this ratio appears to have a credible effect on PriceChange.

Let’s now examine which specific sectors show this effect:

hpd[str_detect(hpd$parameter, "^B_IncomeRatio") & hpd$significance == 1,] %>%
  as.data.frame()

##          parameter   lower  upper significance
## 1 B_IncomeRatio[4] 0.16593 1.0309         TRUE

Here, we find that IncomeRatio is significant only in sector ID 4. To decode that:

sectors[sectors$sector_id == 4,]$sector

## [1] "EDUCATION HEALTH SPORTS AND ENTERTAINMENT SERVICES"

These results aren’t surprising. As mentioned earlier, the exploratory plots and correlation matrix already hinted at a lack of strong relationships. Still, this kind of structured modeling helps uncover where effects might exist, even if they’re subtle or sector-specific.

Before making any changes to the model, let’s take a quick look at the posterior means for the global means (mu_B_) and their corresponding global standard deviations (sigma_B_), which reflect how much each ratio’s effect varies across sectors.

summary_stats <- summary(mod_csim)[1] 

summary_stats <- summary_stats$statistics %>%
  as.data.frame() %>%
  rownames_to_column(var = "parameter")

summary_stats %>%
  filter(str_detect(parameter, "^mu_B|^sigma_B")) %>%
  mutate(predictor = str_remove(parameter, "^mu_B_|^sigma_B_"),
         param = case_when(
           str_starts(parameter, "mu_B_") ~ "mu_B",
           str_starts(parameter, "sigma_B_") ~ "sigma_B")) %>%
  select(predictor, param, Mean) %>%
  pivot_wider(names_from = param, values_from = Mean) %>%
  arrange(desc(sigma_B)) %>%
  as.data.frame()

##      predictor      mu_B  sigma_B
## 1  IncomeRatio  0.211599 0.253627
## 2   STermRatio -0.059052 0.153085
## 3 IncomeChange -0.021633 0.139074
## 4   DebtEquity -0.019327 0.115021
## 5   Payability  0.009165 0.108603
## 6    Investing -0.061637 0.093118
## 7    Inflation -0.063212 0.092583
## 8 ProfitMargin -0.012359 0.090295
## 9 EquityChange -0.070108 0.081386

Only IncomeRatio has both a relatively large global mean and a higher global standard deviation. This suggests that its effect not only exists, but varies meaningfully across sectors.

We’ll keep this in mind when building our second model. After we run the Posterior Predictive Check (PPC), the plan is to allow only IncomeRatio to vary by sector, while treating the other predictors as fixed (non-hierarchical) effects.

Posterior Predictive Check (PPC)

The Posterior Predictive Check (PPC) is a core step in Bayesian model evaluation. The idea is simple: if the model has captured the true data-generating process, then it should be able to generate data that looks like what we actually observed. If the generated data and the real data don’t match, it means the model is likely missing something important, and any inferences we draw from it could be misleading.

We start by extracting the posterior draws from our model and filtering out parameters we don’t need for prediction (hyperparameters like mu_B_ and sigma_B_):

n_data <- length(data$PriceChange)
n_post_samples <- nrow(mod_csim)

post_samples <- as.matrix(mod_csim)
exclude <- str_subset(colnames(post_samples), pattern = "^(mu_B_)|(sigma_B_)")  
post_samples <- post_samples[, !colnames(post_samples) %in% exclude]  # We don't need hyperparameters

Next, we extract the relevant columns for each parameter to use in predictions:

col_int <- which(colnames(post_samples) == "intercept")
col_sigma <- which(colnames(post_samples) == "sigma")

col_IncomeRatio <- which(str_detect(colnames(post_samples), "B_IncomeRatio"))
col_IncomeChange <- which(str_detect(colnames(post_samples), "B_IncomeChange"))
col_DebtEquity <- which(str_detect(colnames(post_samples), "B_DebtEquity"))
col_EquityChange <- which(str_detect(colnames(post_samples), "B_EquityChange"))
col_Inflation <- which(str_detect(colnames(post_samples), "B_Inflation"))
col_Investing <- which(str_detect(colnames(post_samples), "B_Investing"))
col_Payability <- which(str_detect(colnames(post_samples), "B_Payability"))
col_ProfitMargin <- which(str_detect(colnames(post_samples), "B_ProfitMargin"))
col_STermRatio <- which(str_detect(colnames(post_samples), "B_STermRatio"))

For each posterior draw, we generate predicted responses using the model’s structure and simulate new outcomes:

replicated_arrays <- matrix(NA, nrow = n_data, ncol = n_post_samples)

for(i in 1:n_post_samples){
  samples <- post_samples[i,]
  
  intercept <- samples[col_int]
  sigma <- samples[col_sigma]
  
  mu <- intercept + 
    samples[col_IncomeRatio][data_jags$sector_id] * data_jags$IncomeRatio + 
    samples[col_IncomeChange][data_jags$sector_id] * data_jags$IncomeChange +
    samples[col_DebtEquity][data_jags$sector_id] * data_jags$DebtEquity +
    samples[col_EquityChange][data_jags$sector_id] * data_jags$EquityChange +
    samples[col_Inflation][data_jags$sector_id] * data_jags$Inflation +
    samples[col_Investing][data_jags$sector_id] * data_jags$Investing +
    samples[col_Payability][data_jags$sector_id] * data_jags$Payability +
    samples[col_ProfitMargin][data_jags$sector_id] * data_jags$ProfitMargin +
    samples[col_STermRatio][data_jags$sector_id] * data_jags$STermRatio 
  
  replicated_arrays[,i] <- rnorm(n_data, mean = mu, sd = sigma)
  
}

We now compare the actual observed data (PriceChange) with simulated datasets:

y <- data_jags$PriceChange
t_replicated_arrays <- t(replicated_arrays)
p_s <- sample(1:n_post_samples, 100)

ppc_dens_overlay(y, t_replicated_arrays[p_s,])

From the density overlay, it’s clear that the replicated samples do not closely match the actual data. Specifically, the observed data shows a sharper peak near the mean than any of the simulated samples, a sign that the normal likelihood we assumed doesn’t adequately capture the shape of the response distribution.

This misfit suggests that our model is too misspecified. It’s likely overestimating the variance and underestimating the peakedness of the real data. As a result, our predictions are flatter and less concentrated than they should be.

The model fit is poor under the current assumptions. While our hierarchical structure is useful, the assumption of normally distributed errors appears to be too restrictive for this particular dataset. Next, we’ll revise the model.

Second Model

In the previous section, we diagnosed the main limitation of the first model: it failed to capture the true variability in the data. The observed distribution of PriceChange had sharper peaks and bumps in tails, which is not what a normal distribution could accommodate. In other words, the Gaussian likelihood was too rigid and couldn’t represent the deviations present in real market behavior. To address this, we’ll make two important changes to the model:

1. Student-t likelihood Rare but extreme movements aren’t well captured by a normal distribution. A Student-t distribution, on the other hand, has heavier tails and can better accommodate these large deviations. It gives us a robust likelihood, allowing the model to see and adjust for outliers instead of being overly influenced by them or washing them out entirely.

2. Change in Hierarchical Structure In the first model, we assigned a full set of sector-specific coefficients to every predictor. However, our results showed that only one predictor IncomeRatio had a notable global effect and varying sector-level effects. The other predictors did not exhibit meaningful heterogeneity across sectors. So in this revised model, we’ll retain the hierarchical structure only for IncomeRatio allowing its coefficient to vary by sector. The rest of the predictors will be treated as fixed effects.

Model Setup in JAGS

mod2_string <- "model{

  for(i in 1:length(PriceChange)){
    PriceChange[i] ~ dt(mu[i], prec, df)     
    mu[i] <- intercept +  
      B_IncomeChange * IncomeChange[i] + 
      B_IncomeRatio[sector_id[i]] * IncomeRatio[i] + 
      B_EquityChange * EquityChange[i] + 
      B_Inflation * Inflation[i] + 
      B_Payability * Payability[i] + 
      B_STermRatio * STermRatio[i] + 
      B_Investing * Investing[i] + 
      B_ProfitMargin * ProfitMargin[i] + 
      B_DebtEquity * DebtEquity[i]            
  }
  
  
  #Prior on intercept
  intercept ~ dt(0, 1, 3)

  #Prior on SD of PriceChange
  sigma ~ dt(0, 1, 3) T(0,)     #Truncating SD for larger than 0
  prec <- 1 / pow(sigma, 2)
  
  #Priors on coefs of global ratios
    B_IncomeChange ~ dt(0, 1, 3)
    B_EquityChange ~ dt(0, 1, 3)
    B_Inflation ~ dt(0, 1, 3)
    B_Payability ~ dt(0, 1, 3)
    B_STermRatio ~ dt(0, 1, 3)
    B_Investing ~ dt(0, 1, 3)
    B_ProfitMargin ~ dt(0, 1, 3)
    B_DebtEquity ~ dt(0, 1, 3)
  

  #Priors on coefs of sector-specific ratios
  for(j in 1:n_sector){
    B_IncomeRatio[j] ~ dnorm(mu_B_IncomeRatio, prec_B_IncomeRatio)
  }
  
  
  #Hyperpriors on means of sector-specific ratio coefs
  mu_B_IncomeRatio ~ dt(0, 1, 3)
  
  
  #Hyper-priors on standard deviations of sector-specific ratio coefs
  sigma_B_IncomeRatio ~ dt(0, 1, 3) T(0,)         
  prec_B_IncomeRatio <- 1 / pow(sigma_B_IncomeRatio, 2)
  
  df ~ dgamma(2, 0.1)
}"


params2 <- c("intercept", 
            "B_IncomeChange", "B_IncomeRatio", "B_EquityChange", 
            "B_Inflation", "B_Payability", "B_STermRatio", 
            "B_Investing", "B_ProfitMargin", "B_DebtEquity",
            "mu_B_IncomeRatio", "sigma_B_IncomeRatio",
            "sigma", "df")

model2 <- jags.model(textConnection(mod2_string), data_jags, n.chains = 3)

Let’s compare two models using Deviance information criterion (DIC)

dic1 <- dic.samples(model, 5000)
dic2 <- dic.samples(model2, 5000)      

dic <- sum(dic$deviance) + sum(dic$penalty)
dic2 <- sum(dic2$deviance) + sum(dic2$penalty)

dic2 - dic

## [1] -44

The second model has considerably lower DIC score.
Running the model.

update(model2, 10000)

mod2_sim <- coda.samples(model2, params2, n.iter = 50000, thin = 5)
mod2_csim <- as.mcmc(do.call(rbind, mod2_sim))

Note: My posterior samples are in /Financial-Ratio-Analysis/Samples

mod2_sim <- readRDS("/Financial-Ratio-Analysis/Samples/mod2_sim.rds")       
mod2_csim <- as.mcmc(do.call(rbind, mod2_sim))

Convergence Check

Traceplots for just three randomly chosen parameters to keep the presentation concise (Last 5000 posterior samples per chain).

From the traceplots, the chains seem to have mixed well. There is no signs of divergence or stuck values.

Gelman-Rubin diagnostic:

gelman <- gelman.diag(mod2_sim)

gelman_df <- data.frame(param = rownames(gelman$psrf),
                        PointEstimate = gelman$psrf[, 1],
                        UpperCI = gelman$psrf[, 2],
                        Converged = gelman$psrf[, 2] < 1.05)

sum(gelman_df$Converged == "FALSE")

## [1] 0

All chains passed the Gelman-Rubin diagnostic.

Effective Sample Size

efsize <- effectiveSize(mod2_csim)

efsize[which(efsize < 2000)]

##    B_IncomeRatio[4] sigma_B_IncomeRatio 
##             1245.00              987.15

Most parameters are above the 2,000 threshold, but we’ll be a bit cautious when interpreting B_IncomeRatio[4], which falls below 2,000.

Inference

We now examine the 95% HPD intervals for all B_ (regression coefficients) parameters. If the interval excludes 0, that suggests a credible effect on PriceChange.

hpd <- HPDinterval(mod2_csim)
hpd <- as.data.frame(hpd) %>%
  rownames_to_column(var = "parameter") %>%
  rowwise() %>%
  mutate(significance = !between(0, lower, upper))

hpd[str_detect(hpd$parameter, "^mu_B|^B_") & hpd$significance == 1,] %>%
  as.data.frame()

##          parameter      lower     upper significance
## 1 B_IncomeRatio[3]  0.0083746  0.418276         TRUE
## 2 B_IncomeRatio[4]  0.0450694  0.869783         TRUE
## 3 B_IncomeRatio[8]  0.0151967  0.322725         TRUE
## 4      B_Investing -0.1586990 -0.020882         TRUE
## 5 mu_B_IncomeRatio  0.0219553  0.287102         TRUE

From the results, IncomeRatio shows a credible sector-level effect in sector IDs 3, 4, and 8. Additionally, the fixed coefficient for Investing appears to have a negative effect across all sectors.

To identify the sector names:

sectors[sectors$sector_id %in% c(3, 4, 8),]$sector

## [1] "CHEMICALS, PETROLEUM RUBBER AND PLASTIC PRODUCTS"  
## [2] "EDUCATION HEALTH SPORTS AND ENTERTAINMENT SERVICES"
## [3] "HOLDING AND INVESTMENT COMPANIES"

Note: While interpreting the results, it’s important to note that the coefficient for IncomeRatio in sector 4 (B_IncomeRatio[4]) was found to have a credible effect, its HPD interval excludes zero. However, the effective sample size for this parameter was around 1,000, which is the bare minimum for stable inference. Although convergence diagnostics didn’t raise red flags, the relatively low ESS suggests that the estimate may still be somewhat uncertain. This doesn’t invalidate the finding, but it does warrant some caution when interpreting the strength or robustness of the effect in this particular sector.

PPC

Let’s check how well the second model fits the data.

Filtering out parameters we don’t need for prediction (hyperparameters like mu_B_ and sigma_B_):

n_data <- length(data$PriceChange)
n_post_samples <- nrow(mod2_csim)

post_samples <- as.matrix(mod2_csim)
post_samples <- post_samples[, !colnames(post_samples) %in% c("mu_B_IncomeRatio", "sigma_B_IncomeRatio")]

Next, we extract the relevant columns for each parameter to use in predictions:

col_int <- which(colnames(post_samples) == "intercept")
col_sigma <- which(colnames(post_samples) == "sigma")
col_df <- which(colnames(post_samples) == "df")

col_IncomeRatio <- which(str_detect(colnames(post_samples), "B_IncomeRatio"))
col_IncomeChange <- which(colnames(post_samples) == "B_IncomeChange")
col_DebtEquity <- which(colnames(post_samples) == "B_DebtEquity")
col_EquityChange <- which(colnames(post_samples) == "B_EquityChange")
col_Inflation <- which(colnames(post_samples) == "B_Inflation")
col_Investing <- which(colnames(post_samples) == "B_Investing")
col_Payability <- which(colnames(post_samples) == "B_Payability")
col_ProfitMargin <- which(colnames(post_samples) == "B_ProfitMargin")
col_STermRatio <- which(colnames(post_samples) == "B_STermRatio")

For each posterior draw, we generate predicted responses using the model’s structure and simulate new outcomes:

replicated_arrays <- matrix(NA, nrow = n_data, ncol = n_post_samples)
  
for(i in 1:n_post_samples){
  samples <- post_samples[i,]
  
  intercept <- samples[col_int]
  sigma <- samples[col_sigma]
  df <- samples[col_df]
  
  mu <- intercept + 
    samples[col_IncomeRatio][data_jags$sector_id] * data_jags$IncomeRatio + 
    samples[col_IncomeChange] * data_jags$IncomeChange +
    samples[col_DebtEquity] * data_jags$DebtEquity +
    samples[col_EquityChange] * data_jags$EquityChange +
    samples[col_Inflation] * data_jags$Inflation +
    samples[col_Investing] * data_jags$Investing +
    samples[col_Payability] * data_jags$Payability +
    samples[col_ProfitMargin] * data_jags$ProfitMargin +
    samples[col_STermRatio] * data_jags$STermRatio 
  
  replicated_arrays[,i] <- mu + sigma * rt(n = n_data, df = df)
  
}

We now compare the actual observed data (PriceChange) with simulated datasets:

y <- data_jags$PriceChange
t_replicated_arrays <- t(replicated_arrays)
p_s <- sample(1:n_post_samples, 100)

ppc_dens_overlay(y, t_replicated_arrays[p_s,]) +
  coord_cartesian(xlim = c(-4, 4))

From the density overlay, we can see that the replicated samples now match the actual PriceChange data much more closely. The model captures both the central peak and the tails of the distribution reasonably well.

This suggests that our revised modeling choices of switching to a Student-t likelihood and applying hierarchical structure only where it was warranted (on IncomeRatio) have significantly improved the fit. The model now better reflects the true variability and shape of the observed data, giving us greater confidence in the inferences we draw from it.

With this improved fit, we can move on to summarize the final takeaways and reflect on what this analysis reveals about the relationship between financial ratios and stock price change.

Final Thoughts

To wrap things up, I didn’t go into this project expecting miracles. Most of the financial ratios I use are ones I came up with over time because they made sense to me, not because they were academically tested. And given how unpredictable and sometimes irrational the Turkish stock market can be, I wouldn’t have been surprised if none of them showed any effect at all.

In fact, during this particular period, the market was especially unstable and more prone to manipulation than usual. So I already had low expectations about finding strong, consistent relationships between fundamentals and price changes.

Still, it was interesting to see that IncomeRatio, a ratio I personally give a lot of weight to, turned out to have a statistically credible effect in a few sectors. That both reassures me and makes me question my own judgment a bit. Also, the second model suggested that Investing has a slight negative effect overall, which might reflect how investors react to aggressive capital spending when things feel uncertain.

There are definitely limitations here. A proper analysis would usually need multiple time periods, basically panel data, and probably a much longer holding period than just one year. Outliers distort things a lot when you’re zoomed in. It also feels likely that different market conditions (like economic boom vs. recession) need different models altogether.

Even then, this kind of modeling will always be limited in a market like Turkey’s. There aren’t that many stocks, and many of them are from small companies with low liquidity. On top of that, prices are heavily affected by news, social media, momentum, and sometimes just random demand surges. No matter how well you model fundamentals, those things won’t show up in the data unless you explicitly include them.

All in all, I’m glad I did this. Even if the results weren’t groundbreaking, I got to test my own strategy in a more disciplined way.

Financial Statement Analysis Using Bayesian Methods

Taha Kerem Akyol

Overview

Introduction

Financial Statements

Data

1. Stock List and Sector Classification

2. Financial Statements

3. Historical Market Prices (in TRY)

Exploratory Analysis of Sector and Market Cap Based Stock Behavior

a) Data Preperation

b) Company Count per Sector

c) Price Change by Sector

d) Introducing Company Size

e) Price Change by Company Size

1-Range

2-Density Check

3-Median Comparison

4-Bayesian Bootstrap Test for the Difference in Medians

5- Risk/Reward

Statistical Analysis and Bayesian Data Modeling

1. Financial Ratios

2. Calculating the Variables and Setting Up the Data

3. Looking for a Linear Relationship

4. Bayesian Hierarchical Linear Regression

Initial Model

Convergence Check

Effective Sample Size

Inference

Posterior Predictive Check (PPC)

Second Model

Convergence Check

Effective Sample Size

Inference

PPC

Final Thoughts