
【Factor analysis】 Vol. 1. Introduction to the idea of factor analysis

In previous posts, we talked about momentum strategies such as the MACD strategy and the turtle strategy. These are standard strategies based on the hypothesis that when certain variables, such as price moving averages, technical indicators, and other subjective numbers, meet certain requirements, the stock price will follow the trend and continue to go up or down. Scholars believe that momentum keeps pushing the stock price further once such investor confidence has been established.

Besides momentum, there is another methodology out there called factor analysis. To explain it in detail, I’m going to split this huge topic into several parts:

  1. Introduction to the idea of factor analysis
  2. Get ready: preparing and cleaning data
  3. Construct your own factors
  4. Factor scores
  5. Performance analysis

What is factor analysis

I’m not going to intimidate you with lengthy, smart-ass sentences full of big words. Otherwise, I’m going to drive away the last group of the audience that I have left. Let me tell you a story instead, and you’ll get to know the objective of factor analysis.

Story starts here

Long long time ago…

In a small country somewhere on earth, there was a king who was the kindest and most merciful ruler the country had ever had. The kind king had ruled the country for decades.
One day, an idea struck the king: after all the hard work my team and I have done, do my people feel happy about what we have done? And how are we going to keep or improve the level of happiness of our people? So the king assigned this task to the Minister of the Realm.

Easy enough: the Minister designed a survey to investigate the degree of happiness of the people. His team collected the following information about every individual in the realm: gender, age, occupation, location, income, marital status, and the number of social gatherings per week. At the bottom of the survey, the Minister also asked each individual to rate his or her own degree of happiness. He put all the survey answers together in Excel and wanted to present the sheet to the king. Here is what the Excel sheet looks like:

| Degree of happiness (0~100) | Gender | Age | Occupation | Location | Income | Marital status | Number of social gatherings per week |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 60 | male | 28 | miner | village A | $2,000 | single | 2 |
| 80 | female | 32 | housewife | village B | $100 | married | 5 |
| 77 | male | 63 | retired | village C | $500 | divorced | 12 |
| 58 | female | 22 | OL | village A | $1,300 | in a relationship | 7 |



This is what we call “data collection”. We collect abundant data about a question or a specific phenomenon that we’re trying to understand and explain. Here the question is how happy the people in the country are, and how we can further increase their satisfaction by raising the income level, promoting more social events, or even by adding a moat. All we need to do is dig into the data and discover a valuable and meaningful interpretation.

The next step…

At the home of the Minister…

Minister: Ummmm….

The Minister was contemplating how to put together an actionable plan and present it to the king. He could calculate the average degree of happiness of everyone and tell the king how happy his people are. But presenting such a sloppy report would probably cost him his job. What else could he get from this sheet full of numbers?

The son of the Minister, who majored in statistics in college, happened to hear his father’s moaning. He patted his father on the shoulder and told him that two magic words would save his brain juice: “Factor Analysis”.

The foundation of factor analysis is simple and can be summarized in this one-line formula:

$$y = b_1 x_1 + b_2 x_2 + \dots + b_n x_n + \alpha$$

This is the kind of formula we all got acquainted with in junior and senior high. But now we’re giving an actual meaning to it. The degree of happiness is the dependent variable, and the rest are independent variables, or “factors”. After mapping the terminology, the formula should make more sense to you:

$$\text{Degree of happiness} = b_1 \cdot \text{Gender} + b_2 \cdot \text{Age} + b_3 \cdot \text{Occupation} + \dots + b_n \cdot \text{Social gatherings per week} + \alpha$$

where

  • $b_1, b_2, \dots, b_n$ are coefficients, which can be any real number (positive or negative)
  • $\alpha$ stands for the part of the degree of happiness that cannot be explained by the given factors

Therefore, if we can decipher the coefficients in the formula, we can find out which factor contributes the most to the level of happiness of the people in this country. For example, let’s assume the coefficient of income level is 0.008. This indicates that for every $1,000 of extra income you gain, your degree of happiness goes up by 0.008 × 1,000 = 8 points.
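
To make the income example concrete, here is a minimal sketch of how the formula turns factors into a happiness score. Only the income coefficient of 0.008 comes from the example above; the coefficient for social gatherings, the value of alpha, and the function name `predict_happiness` are made up for illustration.

```python
# Hypothetical numbers: only the income coefficient (0.008) comes from the text.
COEFFICIENTS = {"income": 0.008, "social_gatherings": 1.5}
ALPHA = 40.0  # assumed part of happiness that the factors do not explain


def predict_happiness(factors, coefficients=COEFFICIENTS, alpha=ALPHA):
    """Evaluate happiness = b_1 * x_1 + ... + b_n * x_n + alpha."""
    return alpha + sum(coefficients[name] * value for name, value in factors.items())


# Same person, before and after gaining $1,000 of extra income.
base = predict_happiness({"income": 2000, "social_gatherings": 2})
richer = predict_happiness({"income": 3000, "social_gatherings": 2})
gain = richer - base

print(f"happiness gained from $1,000 extra income: {gain:.1f}")
```

Because the model is linear, the gain depends only on the income coefficient and not on the other factors or on alpha.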

More work to do, then the happy ending

Minister: Cool! Finally, my money paid off… I mean, finally you’ve learned something useful. But here comes the question: how can we get those coefficients from the data we collected in the survey?

The son of the Minister showed him several mature tools available on the market. They fed the data into one of the programs and got approximate values for those coefficients. The Minister was very happy to present this new finding to the king…
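
What such a program does under the hood can be sketched with ordinary least squares, one common way to estimate the coefficients (the story does not say which tool or method the son used, so this is an assumption). The survey data below is synthetic: it is generated from made-up "true" coefficients plus noise, so we can check that the fit recovers them. The helper names `solve_linear_system` and `fit_linear_model` are hypothetical, and only two numeric factors are used to keep the sketch short.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_linear_model(rows, targets):
    """Ordinary least squares; returns [alpha, b_1, ..., b_n]."""
    design = [[1.0] + list(row) for row in rows]  # the leading 1.0 estimates alpha
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic survey: the "true" values are assumptions chosen for this demo.
rng = random.Random(0)
TRUE_ALPHA, TRUE_B_INCOME, TRUE_B_SOCIAL = 40.0, 0.008, 1.5
rows, targets = [], []
for _ in range(500):
    income = rng.uniform(100, 3000)
    gatherings = rng.randint(0, 12)
    noise = rng.gauss(0, 2)  # the part of happiness the factors cannot explain
    rows.append((income, gatherings))
    targets.append(TRUE_ALPHA + TRUE_B_INCOME * income + TRUE_B_SOCIAL * gatherings + noise)

alpha, b_income, b_social = fit_linear_model(rows, targets)
print(f"alpha={alpha:.2f}, b_income={b_income:.4f}, b_social={b_social:.2f}")
```

The estimates will be close to, but not exactly, the values used to generate the data, which is why the story calls them approximate. Categorical answers such as gender or occupation would need to be encoded as numbers before they can enter a fit like this.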

That’s the end of the story. After all, we’re not writing a novel here. Hopefully the story at least gives us a general idea of what factor analysis is and how it is used.

Applying factor analysis to the stock market

Factor analysis can be applied not only to survey analysis, but also to post-campaign marketing analysis, financial models, business plans, online ad traffic, and even mobile app traffic. These activities tend to have one major objective and produce lots of data about their inputs and outputs, so they are perfect subjects for factor analysis when we want to understand what happened after a program launched. Once we can condense the data into the formula described above, we gain the knowledge to interpret the collected data and the ability to predict the future.

Let’s step back and take a look at what happens if we apply factor analysis to the stock market. We now use the daily return as our dependent variable, and we get to choose any indicator that could potentially contribute to the rise or decline of the stock price. In a nutshell, we’re trying to understand how much of today’s price change can be explained by yesterday’s values of the picked factors (stock price, trading volume, company fundamentals…). So in the end, the formula would look like this:

$$\text{Daily return}_t = b_1 \cdot \text{Factor}_{1,t-1} + b_2 \cdot \text{Factor}_{2,t-1} + \dots + b_n \cdot \text{Factor}_{n,t-1} + \alpha$$

By having this formula as the result of you performing factor analysis, you’ll be able to:

  • explain the historic price movement of the stock market and become a famous scholar.
  • or predict tomorrow’s stock price using historic data and make a profit out of it.
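
The key mechanical step in the stock-market version is the one-day lag: yesterday’s factor value is paired with today’s return. Here is a minimal sketch with a single factor and purely synthetic data. The factor, the "true" coefficient of 0.5, and the helper names `lagged_pairs` and `fit_one_factor` are all assumptions for illustration, not real market behavior.

```python
import random


def lagged_pairs(factor_series, return_series):
    """Pair the factor value on day t-1 with the return on day t."""
    return list(zip(factor_series[:-1], return_series[1:]))


def fit_one_factor(pairs):
    """Least-squares fit of return_t = alpha + b * factor_{t-1}."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    b = cov / var
    return mean_y - b * mean_x, b


# Synthetic series: today's return depends on yesterday's factor plus noise.
rng = random.Random(1)
TRUE_B = 0.5  # assumed coefficient used to generate the data
DAYS = 1000
factor = [rng.gauss(0, 0.01) for _ in range(DAYS)]
returns = [0.0] + [
    TRUE_B * factor[t - 1] + rng.gauss(0, 0.005) for t in range(1, DAYS)
]

alpha, b = fit_one_factor(lagged_pairs(factor, returns))

# "Predict tomorrow": plug today's factor value into the fitted formula.
forecast = alpha + b * factor[-1]
print(f"alpha={alpha:.5f}, b={b:.3f}, forecast for tomorrow={forecast:.5f}")
```

Getting the lag wrong, for example by pairing today’s factor with today’s return, would let the model peek at information that is not available when the prediction has to be made.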

How exciting this could be!! But be aware that factor analysis itself merely deduces and extrapolates the coefficients from the data we’ve been given. There are pitfalls we need to pay attention to, such as

  1. biased data were provided
  2. duplicated data or missing data
  3. false assumption
  4. overfitting to an optimal situation to maximize the return
  5. simply luck

So don’t get too excited when you achieve a good result after performing factor analysis. You need to examine each step over and over again until you have inspected the model from all angles.
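
The "overfitting" and "simply luck" pitfalls can be demonstrated with a small experiment. In this sketch the returns and all 200 candidate factors are pure random noise, so no factor has any real predictive power. We still pick the factor that looks best on a short in-sample window and then check it on data it has never seen. The sample sizes and the helper name `correlation` are arbitrary choices for illustration.

```python
import random


def correlation(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(42)
N_IN, N_OUT, N_FACTORS = 60, 1000, 200
total = N_IN + N_OUT

# Everything is noise: returns and factors are generated independently.
returns = [rng.gauss(0, 0.01) for _ in range(total)]
factors = [[rng.gauss(0, 1) for _ in range(total)] for _ in range(N_FACTORS)]

# Pick the factor with the strongest in-sample correlation to the returns.
in_sample = [abs(correlation(f[:N_IN], returns[:N_IN])) for f in factors]
best = max(range(N_FACTORS), key=lambda i: in_sample[i])
best_in = in_sample[best]

# Evaluate the same factor on the data that was not used to select it.
best_out = abs(correlation(factors[best][N_IN:], returns[N_IN:]))

print(f"in-sample |corr| of the 'best' factor:     {best_in:.2f}")
print(f"out-of-sample |corr| of the same factor:   {best_out:.2f}")
```

With enough candidates, some factor will look convincing on a short history by chance alone, and that apparent relationship tends to fade on fresh data. Holding back data for an out-of-sample check is one way to catch this.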


That’s all

That’s pretty much it for the introduction to factor analysis. I hope I made it clear enough. We’ll start walking through the implementation step by step in the next post of this series.
