Australian (ASX) Stock Market Forum

Adventures in AI

I've spent the last rainy Melbourne week exploring AI and whether or not it could be useful in trading.

This is something that I've attempted before, but gave up on as I didn't have the time. I still don't have the time, but I do have ChatGPT :cool: What could go wrong!!!

My system/setup is:
- Visual Studio 2022.
- Python.
- TensorFlow (GPU Version).
- Norgate Data Platinum.

Last week I set all this up and got everything working to run a simple test. WHY does setting up anything have to be soooo damn hard? NOTHING installed correctly, modules that should work didn't, etc. Getting TensorFlow to use the GPU was probably the hardest step of all and took over 2 days. ChatGPT :vomit: was more useless than ever here.

This rainy week I spent arguing with ChatGPT :mad:, but I did manage to get an initial model built and trained on AAPL data. It was really only after I ditched ChatGPT :xyxthumbs and started exploring the code it gave me that I was finally able to build something that worked and could be expanded on.

My initial testing is on the Nasdaq 100. I have now trained a simple network on the Nasdaq 100 from 2000 up until the start of 2021. My simple strategy was that the AI would buy the 5 best stocks and sell them in a week. So it had to pick the 5 stocks with the highest probability of the highest returns for 1 week and then sell. To get a reasonable sample it did this every day, i.e. every day it bought the 5 best stocks and sold them 5 days later, so on any given day it had 25 parcels.
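
The overlapping-parcel bookkeeping described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code used in this thread: `scores` and `closes` are hypothetical per-day dictionaries, entry and exit are both taken at the close, and costs and position sizing are ignored.

```python
def run_overlapping_parcels(scores, closes, top_n=5, hold_days=5):
    """Buy the top_n highest-scoring symbols each day and sell hold_days later.

    scores[d][sym] -> model score for sym on day d (hypothetical input)
    closes[d][sym] -> closing price for sym on day d (hypothetical input)
    Returns a list of (entry_day, symbol, simple_return) tuples.
    """
    trades = []
    # Stop early enough that every parcel has an exit price.
    for day in range(len(closes) - hold_days):
        ranked = sorted(scores[day], key=scores[day].get, reverse=True)
        for sym in ranked[:top_n]:
            entry = closes[day][sym]
            exit_price = closes[day + hold_days][sym]
            trades.append((day, sym, exit_price / entry - 1.0))
    return trades


def open_parcels(trades, day, hold_days=5):
    """Count parcels bought on or before `day` that have not been sold yet."""
    return sum(
        1 for entry_day, _, _ in trades
        if entry_day <= day < entry_day + hold_days
    )


# Toy data: 6 symbols, 12 days, prices drifting up at different rates.
symbols = ["A", "B", "C", "D", "E", "F"]
closes = [{s: 100.0 + d * (i + 1) for i, s in enumerate(symbols)} for d in range(12)]
scores = [{s: float(i) for i, s in enumerate(symbols)} for _ in range(12)]

trades = run_overlapping_parcels(scores, closes)
print(len(trades), open_parcels(trades, day=6))
```

Once the first five days have passed, the number of open parcels settles at top_n x hold_days, which is where the figure of 25 comes from.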

Here are the results of my latest bot, run on the NDX for 2021 (i.e. data it wasn't trained on).
- NDX Index UP 26%
- @Nick Radge TLT UP 13%
- AI Bot was UP 43% (with a number of anomalies, probably to do with the data importing or public holidays)

** Note: I may have some delisted-stock (survivorship) bias as I haven't incorporated that code yet; this is all still a test **

While I have NO definitive answer yet and the initial testing was mixed between encouraging and disheartening, the more I play the more I wonder if the bot can trade better than me.

Time to switch off for the weekend.
 
Please make sure your hardware doesn't generate (random) errors, and also triple-check your random number generator, which in modern computers is normally software-generated (a pseudo-random number generator). Unfortunately, erroneous input data is hard to detect quickly (which is essential if you need rapid execution of transactions).

AI will tend to compound introduced errors.

Good luck
 
Hi Divs,

What sort of random errors?

I would hope that TensorFlow seeds its random numbers from a clock (CPU/GPU or RTC).

It is definitely something I should look at though, especially since in a neural network it would be impossible to find.
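
On the seeding question: as far as I know, TensorFlow picks a fresh seed for each run when none is set, so two training runs will generally differ. Below is a minimal sketch of pinning the seeds so that runs can be compared like for like. It assumes `tf.random.set_seed` is available (TensorFlow 2.x); the helper name `seed_everything` is made up for this example, and GPU kernels may still introduce small non-deterministic differences.

```python
import random


def seed_everything(seed):
    """Seed every random source this sketch knows about (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)  # legacy global NumPy generator
    except ImportError:
        pass  # NumPy not installed: nothing to seed
    try:
        import tensorflow as tf
        tf.random.set_seed(seed)  # global TensorFlow seed
    except ImportError:
        pass  # TensorFlow not installed: nothing to seed


seed_everything(42)
first = [random.random() for _ in range(3)]
seed_everything(42)
second = [random.random() for _ in range(3)]
print(first == second)
```

With fixed seeds, a change in results between two runs points at the data or the code rather than at the random initialisation.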
 
Anything.

At one stage Western Digital turned off the error-checking on their hard drive cache (so they could get faster read speeds for Microsoft Xboxes). I accidentally introduced an error into a system by placing a DVD burner/rewriter too close to the RAM (it only took me two days to diagnose the issue). Then there are things like inadequate cooling/airflow on hot days, or even unshielded hard-drive cables.

A few years back there were tools available to seriously test the hardware for (non-software) flaws, and I was shocked at which equipment had problems and where (especially in ex-commercial servers). Don't forget any add-in cards (errors and cooling issues).

Since you are running networks, look at shielded Ethernet/network cables as well (it could even be a low-flying plane if you are near an airport, with the cables acting as an antenna).

But first you need to spot (any) error, which is tiresome, and then track it down and diagnose it.

Now, the modern GPU MIGHT provide a true random number generator; surprisingly few motherboards have a TRNG, in a bid to cut costs and power consumption.

When WD turned off the error-checking on their hard drives, the logic was that it was faster to throw out/replace a bad (game) frame than to correct the damaged one (and the gamer would barely notice the flicker in an intense video game).

HOWEVER, that brand of hard drive was (at the time) also widely used by serious computer geeks and professionals.

Cheers
 
Pseudo-random numbers were (when I was really into computers) generated using the Fibonacci sequence as the base for the software routine (once you know that, cracking is much easier).
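
For reference, the classic Fibonacci-style scheme is the additive lagged Fibonacci generator, where each new value is the sum of two earlier values in the stream. Here is a rough sketch, assuming the commonly quoted lags (24, 55); the LCG used to fill the initial state is just a convenient choice for this example. Current libraries generally use other algorithms (Python's own `random` module uses the Mersenne Twister), so this is illustrative only.

```python
from collections import deque


def lagged_fibonacci(seed, j=24, k=55, modulus=2**32):
    """Yield x[n] = (x[n-j] + x[n-k]) mod modulus, forever."""
    # Fill the first k values with a simple LCG (an assumption of this sketch).
    state = deque(maxlen=k)
    x = seed
    for _ in range(k):
        x = (1664525 * x + 1013904223) % modulus
        state.append(x)
    while True:
        nxt = (state[-j] + state[-k]) % modulus
        state.append(nxt)  # the oldest value drops out automatically
        yield nxt


gen_a = lagged_fibonacci(seed=1)
gen_b = lagged_fibonacci(seed=1)
stream_a = [next(gen_a) for _ in range(5)]
stream_b = [next(gen_b) for _ in range(5)]
print(stream_a == stream_b)  # same seed, same stream
```

The whole stream is determined by the seed and the two lags, which is why knowing the scheme makes the output predictable.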
 
When I started this project (again) I was under NO illusions: it would be difficult and would probably fail. I also thought that it'd be one big waste of time, which is what has put me off starting again for so long.

I'm actually surprised at just how quickly I got some sort of working model happening. This week I'll keep tinkering with the model I have, and I'm hoping that I can start answering some of the BIG questions.

For almost all time periods the model beat the index (sometimes considerably), so that was encouraging, except of course for the latest time period.

Model trained on 2010 -> 2020, run over the current market (2023 -> present):
Nasdaq = 33.38%
AI = 32.05% ** Maybe I should train the model up until 2023? ... OR has the market changed too much this year?

Given the model's simple approach (buy the 5 best and sell in 7 days), tracking close to the index is encouraging.

[Attached image: 1696803942105.png]

Good Day: [Attached image: 1696804137867.png]

Bad Day: [Attached image: 1696804642842.png]
 
I had a setback yesterday.

I finally found that lurking future-looking (look-ahead) input in the network 🤬. I'd like to blame ChatGPT, but really it was my inexperience with Python.

The giveaway was when I looked at yesterday's post and started thinking about and investigating MRVL. MRVL had the best week ever that week; it was too good to be true.
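
One generic way to hunt for this kind of leak is to recompute each input with the later data hidden and check that nothing changes. The sketch below is not the check used in this thread; `find_lookahead` and the two toy features are hypothetical names made up for the example.

```python
def find_lookahead(feature_fn, series, min_history=5):
    """Return the indices where a feature changes when later data is hidden.

    A feature that only uses the past gives the same value at index t whether
    it is computed on series[:t + 1] or on the full series.
    """
    full = feature_fn(series)
    leaks = []
    for t in range(min_history, len(series)):
        truncated = feature_fn(series[: t + 1])
        if truncated[t] != full[t]:
            leaks.append(t)
    return leaks


def roc_past(series):
    """1-bar rate of change using only the current and previous bar."""
    return [None] + [series[i] / series[i - 1] - 1.0 for i in range(1, len(series))]


def roc_leaky(series):
    """Deliberately broken: uses the NEXT bar, i.e. tomorrow's close."""
    n = len(series)
    return [series[i + 1] / series[i] - 1.0 if i + 1 < n else None for i in range(n)]


prices = [100.0, 101.0, 99.5, 102.0, 103.5, 103.0, 105.0, 104.0, 106.5, 108.0]
print(find_lookahead(roc_past, prices))   # clean feature: nothing flagged
print(find_lookahead(roc_leaky, prices))  # flags every index that has a "tomorrow"
```

Running a check like this over every input column is cheap compared with a retrain, and it catches off-by-one shifts that are easy to miss by eye.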

MOVING ON !!!

So yesterday (after removing the future leak) I retrained up to 2021. I noticed that the further my test period got from the training period, the worse the results were. In 2021 I was keeping up with the index... good. However, in 2023 I was so far behind the index that a monkey could have thrown darts and got a better result. BAD.

2021: NDX UP 26% .... AI UP 26%, relative 0%
2022: NDX DN 33% .... AI UP 6%, relative +39%
2023: NDX UP 36% .... AI DN 1.36%, relative -37.36%
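
The relative figures above are simple year-by-year differences. Chaining the years shows what the same starting capital would have done across all three; here is a small sketch, assuming the three numbers are calendar-year returns that compound on one another.

```python
def compound(yearly_returns_pct):
    """Chain yearly percentage returns into one total percentage return."""
    growth = 1.0
    for r in yearly_returns_pct:
        growth *= 1.0 + r / 100.0
    return (growth - 1.0) * 100.0


ndx = [26.0, -33.0, 36.0]   # 2021, 2022, 2023 index figures from the post
ai = [26.0, 6.0, -1.36]     # 2021, 2022, 2023 model figures from the post

print(f"NDX {compound(ndx):.1f}%  AI {compound(ai):.1f}%")
```

On these figures the three-year total comes out at roughly +15% for the index and +32% for the model, with the gap opened in 2022 and partly given back in 2023.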

I really don't think it's learning anything meaningful yet.

NOTE: This is still a relatively small model with simplistic rules, trained to EOY 2020. The goal at the moment is to set up a testing environment so I can measure how effective changes to the network are.

[Attached image: 1696891985083.png]
[Attached image: 1696893114260.png]
[Attached image: 1696895944153.png]
 
Thanks for the support @Warr87.

As I said, I'm under no illusions: this could be a waste of time.

I'm also of the opinion that, day by day, more people are going to try this now that the resources are so easily available. So there should be a thread where people can start asking questions.

For now I'm happy just creating a blog of trials and errors!!!
 
I've followed along with a tutorial before, doing some AI on gold futures, so there are some resources out there. I'm sure more will become available as more people get involved, though I think the edge will disappear by then. But theoretically you could also run similar models on some large ETFs, groups of indexes, groups of futures, or do it all on individual stocks as well, so there's plenty of room for ideas. I think once you find the right kind of learning/training model, a good balance of IS/OS (in-sample/out-of-sample) material, etc., it'll become easier. Also, how often do you tune/optimise your model? AI models drift... always hard questions to answer, even for data scientists.
 
I'm sure they already are well ahead.

I'm currently training on 10 years of Nasdaq 100 EOD (end-of-day) data. It's far too small a dataset to gain anything meaningful. The network stops improving within about 3-5 epochs, which means there's not a lot of information to be learned from such a small dataset or a small network. Increasing the size of the network helps considerably, but it can also lead to overfitting and a big increase in training time.

My inputs into the network are:
- RSI(2) -> RSI(144), with periods following the Fibonacci sequence.
- C/MA(2) -> C/MA(144), same periods.
- Bollinger Bands, upper and lower, also adjusted for price.
- Stochastic.
- ROC(2) -> ROC(144).
- ATR(2)/C -> ATR(144)/C.
- OBV.

There are around 90 different inputs going into the model. That could be too many or too few, I really don't know yet.
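
To make the input list concrete, here is a minimal sketch of building one indicator family across Fibonacci lookbacks from 2 to 144. It covers only ROC and close/MA, uses plain lists rather than the author's actual data pipeline, and the function names are made up for this example.

```python
def fib_periods(lo=2, hi=144):
    """Fibonacci numbers between lo and hi inclusive: 2, 3, 5, ..., 144."""
    periods, a, b = [], 1, 2
    while a <= hi:
        if a >= lo:
            periods.append(a)
        a, b = b, a + b
    return periods


def roc(closes, n):
    """n-bar rate of change; None until n bars of history exist."""
    return [None if i < n else closes[i] / closes[i - n] - 1.0
            for i in range(len(closes))]


def close_over_ma(closes, n):
    """Close divided by its n-bar simple moving average (price-normalised)."""
    out = []
    for i in range(len(closes)):
        if i + 1 < n:
            out.append(None)
        else:
            window = closes[i + 1 - n: i + 1]  # last n closes, including today
            out.append(closes[i] / (sum(window) / n))
    return out


def build_features(closes):
    """One column per (indicator, period) pair, keyed by name."""
    features = {}
    for n in fib_periods():
        features[f"roc_{n}"] = roc(closes, n)
        features[f"c_ma_{n}"] = close_over_ma(closes, n)
    return features


closes = [float(i) for i in range(1, 201)]  # toy price series
features = build_features(closes)
print(fib_periods(), len(features))
```

Ten lookbacks per indicator adds up quickly, which is one way an input count in the region of 90 can arise from only a handful of indicator types.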

My outputs are the quantiles Q10, Q50 and Q90, i.e.:
Q10 = 10% of the time the price is below Q10 and 90% of the time above.
Q50 = 50% below and 50% above.
Q90 = 90% below and 10% above.

I'm not sure how you could compute something like this in a classical algo; maybe linear regression or even adaptive Bollinger Bands?

Currently I'm just picking the stocks with the greatest Q50 return while I build the infrastructure, but by calculating the quantiles you can basically build a distribution of estimated returns vs risk. This is where AI could be beneficial in classical mechanical trading programs.
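
A common way to get quantile outputs like these, both from a network and from classical quantile regression, is to minimise the pinball (quantile) loss. Whether that is the loss used here is not stated, so treat this as a sketch of the general technique in plain Python; in a Keras model the same formula could be written with tensor operations.

```python
def pinball_loss(actual, predicted, q):
    """Quantile loss: under-prediction costs q, over-prediction costs 1 - q."""
    diff = actual - predicted
    return max(q * diff, (q - 1.0) * diff)


def mean_pinball(returns, predicted, q):
    """Average pinball loss of one constant prediction over a sample."""
    return sum(pinball_loss(r, predicted, q) for r in returns) / len(returns)


def best_constant(returns, q):
    """Brute force: the sample value that minimises the mean pinball loss."""
    return min(returns, key=lambda c: mean_pinball(returns, c, q))


# Toy sample: the values 1..100, so the true quantiles are easy to see.
returns = [float(r) for r in range(1, 101)]
for q in (0.1, 0.5, 0.9):
    print(q, best_constant(returns, q))
```

Minimising this loss with q = 0.1, 0.5 and 0.9 pushes the three outputs towards the 10th, 50th and 90th percentiles of the outcome, which is the Q10/Q50/Q90 definition given above.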
 
90 seems like a lot. Though maybe it's not? Probably use fewer indicators too, as a lot of them just express similar things in slightly different ways. I think finding something that is highly correlated but quicker to react could be an attempt at a leading indicator. I wonder if, say, changes in FX could be something? Why? FX tends to react quickly to world events, news, etc. Though maybe I'm completely off. Don't know. I know some have tried to use seasonal weather data to help predict the direction of agricultural futures. Maybe a data input that is not price could help in training?
 
Not sure how often I should train, potentially each weekend, but this is also something to test, i.e. compare results from a model retrained each week vs each month vs each year and work out the best point.

It also leads to the question of how far back it should be trained. Is a rolling window of 5 years better than a fixed start point of 2000? How valid is the old data vs newer data, OR do we train from 2000 to see how it handles 2000, 2008 and 2011?

My current thinking is to test a 5Y rolling window and shorten the periods between training sessions, i.e. 1 year, then 6 months, to see if there are improvements.
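
The two options being weighed here (a 5-year rolling window vs always starting from 2000) can be laid out as train/test schedules before any training is done. Below is a minimal sketch working in whole years; the function name and the `anchored` flag are made up for this example, and a 6-month retraining interval would need real dates instead of year numbers.

```python
def walk_forward_windows(first_year, last_year, train_years=5,
                         step_years=1, anchored=False):
    """List ((train_start, train_end), (test_start, test_end)) year pairs.

    anchored=False -> rolling window of train_years
    anchored=True  -> training always starts at first_year (expanding window)
    """
    windows = []
    test_start = first_year + train_years
    while test_start <= last_year:
        train_start = first_year if anchored else test_start - train_years
        test_end = min(test_start + step_years - 1, last_year)
        windows.append(((train_start, test_start - 1), (test_start, test_end)))
        test_start += step_years
    return windows


rolling = walk_forward_windows(2000, 2023)
expanding = walk_forward_windows(2000, 2023, anchored=True)
print(rolling[0], rolling[-1])
print(expanding[-1])
```

Running the same model through both schedules and comparing only the out-of-sample years gives a like-for-like answer to the old-data vs new-data question.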
 
90 is a lot. My inputs need a LOT of work; it's what I'm doing today, actually. I'm culling them down to perhaps just the series of RSI(), ROC() and maybe V/MA(V), then I'll build or cull from there to see if it helps.
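
One mechanical way to do a first pass of culling, in line with the point that many indicators say the same thing in slightly different ways, is to drop any input that is almost perfectly correlated with one already kept. This is a sketch of that general idea, not the method used here; the 0.95 threshold and the feature names are arbitrary choices for the example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cull_correlated(features, threshold=0.95):
    """Keep a feature only if it is not highly correlated with any kept one."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Toy columns: the second is just the first one doubled.
toy = {
    "roc_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "roc_a_doubled": [2.0, 4.0, 6.0, 8.0, 10.0],
    "other": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(cull_correlated(toy))
```

The result depends on the order of the columns, since the first of any correlated group is the one that survives, so it is worth listing the preferred inputs first.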
 
I think that's a good plan, with the rolling windows.
 
RSI and ROC are good. Including volume would be useful, I think, and a volatility measure would be useful as well. Which vol measure you use will probably depend on the data set (that is, a vol measure for indexes such as the ASX300 or S&P500, etc.).
 
Interesting, and not surprising: you face the same problems as we do building systems by hand and optimising parameters.
Do not overfit, take volatility into account, which realm, what about volume and not just price, etc.
I am 100% certain AI can help, but by hiding part of the process it can be even more dangerous than the manual, bug-prone approach, where we at least try to understand what is being done.
Having faith in this approach with a substantial amount of your own money involved will not be easy in chaotic times.
Well done, and following with great interest.
 
Volatility?? In the current scenario there are reports of an increase in trading of zero-days-to-expiry options at the expense of other option trading.

Just asking the question: is the current volatility index as indicative as it has been in the past?

Good luck
 
Not to mention future leaks.

It was encouraging to know that the model could learn to use future-leaking data, even just one column of it, and produce unrealistically good results. Just like AB.

Hiding part of the process: this is an interesting topic. My neural network has hundreds of millions of connections and weights. While you can view the structure internally, there is no way to analyse or predict what's going on under the hood; you can only really measure the performance of the outputs. You can never know how it will react to any particular change in input.

How much would you trust it with your money? Within the next few years it will be driving most new cars.
 
So this morning I managed to import into AB. While Python has a lot of libraries like Zipline, I still like AB's reports.

AI Vs NDX 2021 - 2022 - 2023

[Attached image: 1697077646720.png]

[Attached image: 1697077514466.png]
 