in reply to Bayesian not-for-spam

I should start off by saying that I own a company that does analysis on the stock market and provides that technical analysis to subscribers and also trades on portfolios based on that information.

When I first got into Perl, it was because of an encryption problem I was working on (the Poe Cipher). With that, I learned about Markov matrices, Bayesian analysis of language, and how they related to Perl.

Having been obsessed with the stock market since 4th grade, I immediately tried to think of ways that Bayesian-type analysis could be used to predict it. I knew I wouldn't be the first to have thought of it, but I wanted to work out whether it was feasible or not.

If you are going to do it, you will have very little success feeding information in and having it say that N days from now the price will be X. You will have slightly better success having it say that over the next N days the price will go up/down Y percent. And assuming you have the right code, you will get relatively decent accuracy from having it say only whether the market will go up or down in the next N days.
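
To make the "up or down in the next N days" version concrete, here is a minimal sketch. It is in Python rather than Perl purely for illustration, and it is not the system I run: the function name, the single "did today close up or down" feature and the add-one smoothing are all assumptions chosen to keep it short.

```python
from collections import Counter

def direction_probabilities(closes, horizon=5):
    """Estimate P(price is higher `horizon` days later | today closed up or down).

    Counts are smoothed by adding one to each outcome, so a case that was
    never observed comes out as 0.5 instead of a hard 0 or 1.
    """
    counts = Counter()
    for i in range(1, len(closes) - horizon):
        today = "up" if closes[i] > closes[i - 1] else "down"
        later = "up" if closes[i + horizon] > closes[i] else "down"
        counts[(today, later)] += 1
    probs = {}
    for today in ("up", "down"):
        ups = counts[(today, "up")]
        total = ups + counts[(today, "down")]
        probs[today] = (ups + 1) / (total + 2)
    return probs

# Made-up closing prices, purely for illustration.
closes = [10, 11, 12, 11, 12, 13, 14, 13, 14, 15, 16, 15, 16, 17]
print(direction_probabilities(closes, horizon=2))
```

The output is only a pair of conditional frequencies. Predicting an exact price or a percentage move would need far more than this, which is the point of the ordering above.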

I have since moved on to neural nets and genetic algorithms that mix traditional trading methods with non-linear analysis that we don't necessarily grasp intuitively (most things that we consciously work out mathematically in daily life are linear).

The amount of computing power involved is a bit much though. I use Perl and Parallel::ForkManager: the script iterates over the data and feeds file names (ticker symbols) into a C program, and as those jobs build up they are handed out to the nodes of the cluster I have. The C program analyzes thousands of days of data thousands of times in a row, and each node gets that done in about a second. There are thousands of tickers in the US markets alone, so on a single processor that is still going to take some time to churn through. And that is still a relatively basic system - I have more complex code in the works that is likely to take 5 times as long to run.
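
The fan-out pattern looks roughly like the sketch below. It is Python's standard library standing in for Perl and ForkManager, it runs on one machine rather than a cluster, and the C program is replaced by a stand-in child process that only echoes its ticker. The names run_for_ticker, run_all and stand_in are made up for this example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_for_ticker(command, ticker):
    """Run the external program once for one ticker and return (ticker, stdout)."""
    result = subprocess.run(command + [ticker], capture_output=True,
                            text=True, check=True)
    return ticker, result.stdout.strip()

def run_all(command, tickers, workers=4):
    """Keep up to `workers` copies of the external program running at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda t: run_for_ticker(command, t), tickers))

# Stand-in for the compiled analysis program (assumption): a child Python
# process that just echoes the ticker symbol it was given as its argument.
stand_in = [sys.executable, "-c", "import sys; print('analyzed ' + sys.argv[1])"]
print(run_all(stand_in, ["KO", "PEP", "IBM"]))
```

The threads here do no analysis themselves; they only wait on the child processes, so the number of workers is effectively the number of copies of the external program running at a time.
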
That said, I have written some more basic scripts in Perl that are actually very fast (largely thanks to Memoize, since there is a lot of analysis on the same data over and over in loops), and they are showing that they might actually be more useful than the neural nets and the like.
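
To show why caching pays off here, a small sketch using Python's functools.lru_cache as a rough analogue of Perl's Memoize. The moving-average crossover scan is only an assumed example of "the same analysis on the same data over and over", and the prices are made up.

```python
from functools import lru_cache

PRICES = (10.0, 11.0, 12.0, 11.0, 13.0, 14.0, 13.0, 15.0)  # made-up closes

@lru_cache(maxsize=None)
def moving_average(end, window):
    """Average of the `window` closes ending at index `end` (inclusive)."""
    start = end - window + 1
    return sum(PRICES[start:end + 1]) / window

def crossover_days(fast, slow):
    """Indices where the fast average sits above the slow one."""
    return [i for i in range(slow - 1, len(PRICES))
            if moving_average(i, fast) > moving_average(i, slow)]

# Scanning several parameter pairs keeps asking for the same averages;
# the cache answers the repeats without recomputing them.
for fast, slow in [(2, 4), (2, 5), (3, 5)]:
    crossover_days(fast, slow)
print(moving_average.cache_info())
```

The hit count in cache_info() is the number of averages that did not have to be recomputed. On real data with many parameter combinations the share of hits grows quickly.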

Do keep in mind that you don't want the code to know everything that has happened in the past - otherwise it will tell you what it would have done back then. The stock market changes - you want it to figure out a generalized rule that is right N% of the time (where N is high enough to make you money, or rather, not lose you money) on as little data as possible. That way, as times change, the rule should still work well even though the environment has changed.
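
One common way to check for this is walk-forward testing: fit on a short trailing window, score only on the block that follows, then slide forward. The sketch below is Python for illustration, and the "rule" in it is a deliberately trivial toy (bet that the next block goes the same way as the training window's total), not anything I trade on. The window sizes are arbitrary assumptions.

```python
def walk_forward(returns, train=20, test=5):
    """Yield (past, future) pairs; the rule never sees the block it is scored on."""
    start = 0
    while start + train + test <= len(returns):
        yield (returns[start:start + train],
               returns[start + train:start + train + test])
        start += test

def hit_rate(returns, train=20, test=5):
    """Fraction of out-of-sample days where the toy rule guessed the direction."""
    hits = total = 0
    for past, future in walk_forward(returns, train, test):
        guess_up = sum(past) > 0  # toy rule: recent drift continues
        for r in future:
            hits += (r > 0) == guess_up
            total += 1
    return hits / total if total else None

# Made-up daily returns: a steady drift up, so the toy rule is always right.
print(hit_rate([0.01] * 30))
```

A rule that scores well in-sample but poorly in this kind of test has memorized the past rather than found something general.
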
Also, you generally don't want to feed in dates. They do help somewhat, in that the system will learn when earnings reports are and, if it is a good system, it will learn to stay out of the market at that time. But for the most part you don't want to feed in raw data at all - you want to run it through some normalization functions first and squash it down. If it learns what to do when the price is 56, but 3 years later the price is 13, the program doesn't know what to do.
So you want it to analyze numbers that are always in a normalized range and act accordingly on that.
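
A minimal sketch of that kind of squashing, in Python for illustration. Percent changes and min-max scaling are just two common choices, assumed here; they are not necessarily the normalization functions I use. The two price series are made up so that both make the same +2%, -3% move from different price levels.

```python
def percent_changes(closes):
    """Day-over-day fractional change; independent of the absolute price level."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]

def squash(values, lo=-1.0, hi=1.0):
    """Min-max scale values into [lo, hi]; a flat series maps to the midpoint."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

high = [56.0, 57.12, 55.4064]  # +2% then -3%, starting at 56
low = [13.0, 13.26, 12.8622]   # the same two moves, starting at 13
print(percent_changes(high))
print(percent_changes(low))
```

Both series come out as essentially the same pair of numbers, so whatever the program learned at 56 still applies at 13.
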
It is also worthwhile to determine which stocks move with or against the stock that you are looking at. If KO goes up, does PEP tend to go up too, or go down?
That adds a tremendous amount of data on top of the problem of analysis.
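
The simplest way to ask the "with or against" question is a correlation of daily returns, sketched below in Python for illustration. Pearson correlation is an assumption on my part for the example, and the two price series are made up; they are not real KO or PEP data.

```python
from math import sqrt

def daily_returns(closes):
    """Day-over-day fractional change."""
    return [(b - a) / a for a, b in zip(closes, closes[1:])]

def pearson(xs, ys):
    """Pearson correlation of two equal-length series; None if either is flat."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return None
    return cov / sqrt(vx * vy)

# Made-up closes: stock_b is exactly twice stock_a, so the returns match.
stock_a = [50.0, 51.0, 50.5, 52.0, 51.0]
stock_b = [100.0, 102.0, 101.0, 104.0, 102.0]
print(pearson(daily_returns(stock_a), daily_returns(stock_b)))
```

A value near +1 means the two tend to move together, near -1 means they move against each other, and near 0 means no linear relationship. Doing this for every pair among thousands of tickers is where the data volume explodes.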

For those that say that the market is random - I would say that there are many out there that are perfectly happy making money off of what they see as non-random.
People on either side stand to benefit from being right in that assertion.
I personally hope to start a hedge fund within the next ten years if they haven't gone under due to overregulation by then. Until then, I will be making money doing what I do now.

(I will also add that a neat thing you will notice is that the formulas that aid you in analysis are formulas that work in many different disciplines, which is why so many investment banks were hiring physicists back in the early '90s. Things that work in physics and heat flow work in currency trends and in stock market movement, whether at monthly or 5-minute bars. The main difference to note is with weather. In weather, you can make a prediction and it has no effect at all on the outcome: I could say it will rain and everyone in the area will put on a raincoat, but that in itself won't make it rain or not rain. In the stock market, depending on what analysis you are doing (some are more easily broken than others), by pointing something out and acting on it you in effect break the system, and it will behave differently.)


-------------------------------------------------------------------
There are some odd things afoot now, in the Villa Straylight.