Programming Neural Networks and Fuzzy Systems in FOREX Trading

Presentation Transcript


  1. Programming Neural Networks and Fuzzy Systems in FOREX Trading, Presentation 3. Balázs Kovács (Terminator 2), PhD Student, Faculty of Economics, University of Pécs, E-mail: kovacs.balazs.ktk@gmail.com; Dr. Gabor Pauler, Associate Professor, Department of Information Technology, Faculty of Science, University of Pécs, E-mail: pauler@t-online.hu

  2. Content of the Presentation • Artificial Neural Networks: • Java Object Oriented Network Environment (JOONE): • Installation • Sample application: Vinitaly • Data Stream-based Graphic User Interface • Building blocks of the Joone Neural Model • Neuron Layers • Synapse Sets • Special Layers • Plug-in Modules • Usage of the Joone Neural Model: • Network Building • Assembling the Learning Module • Training of the Network • Post-Training Activities • Activation of the Network • Forecasting Neural Models in Joone • Temporal Associative Memory (TAM) • Jordan-type Recurrent Backpropagation • Elman-type Recurrent Backpropagation • Forecasting Trend Turning Points • Forecasting Trends • Backpropagation Through Time (BPTT) • Time Delay Neural Network (TDNN) • References

  3. Joone: Installation • Java Object Oriented Network Environment (JOONE), developed by Paolo Marrone, is an Integrated Development Environment (IDE) for ANNs using a Data Stream-based Graphic User Interface (GUI): •  It enables rapid prototyping of ANN applications for industrial/business problems •  It is an open-source Java-based application •  Under its GNU license, it can be used free of charge for non-commercial applications •  However, it shows nothing of the internal working of topologies, so it requires a more experienced developer •  It is less flexible in terms of topology and learning methods than other ANN modeling tools (e.g. SNNS, JNNS), which are closer to the biological analogy but require more development time • Installation of Joone: • First, install the latest version of the Java Runtime Environment (JRE): http://www.java.com/en/download/index.jsp • Make sure your Win7 security settings allow the JRE installer to write the JRE path into the PATH variable of Win7! If not, add it manually at Control Panel|System|Special settings|Environment button|PATH|Edit button • Then download the file JooneEditor2.0.0RC1-All.zip from http://sourceforge.net/projects/joone/files/joone-editor/2.0.0RC1/ and extract it to the C:\Joone directory • Run C:\Joone\RunEditor.bat to launch the Joone editor.

  4. Joone: Sample Application 1 • BNL Multiservizi S.p.A (http://www.bnl.it) is a dynamic investment bank financing small enterprises in Northern Italy and the main sponsor of Vinitaly (http://www.vinitaly.it), the largest wine expo in Italy, organized yearly in Verona. We have a database of 178 wines, classified by a jury into categories 1: Excellent .. 3: Average. It contains the following properties for each wine: • Alcohol % • Citric acid content • Natural chlorides • Sugars • Ripening day number • Potassium acid content • Pectin-dissolving enzymes • Dry material content • Sulphates • Stabilizers • Wholesale retail price, all stored in the 3-1Sample.xls Excel file • Let's build a 3-layer perceptron ANN model with Joone which estimates the category membership of a wine from its properties

  5. Joone: Data Stream-based Graphic User Interface • In File|Open we can open a *.ser file (3-1Sample.ser) containing the ANN model of the problem, where we will see a block diagram representing data streams in the ANN. It is a very compact description that does not even show the topology in detail:  so it is much easier to get an overview of large, multi-layered networks,  but it requires more advanced knowledge of ANN theory to understand it: • Blocks can be: data sources (e.g. Excel Input) or their pre-filters (e.g. Normalizer), I/O interfaces (e.g. Learning Switch), neuron layers (e.g. Linear, Sigmoid), training modules (e.g. Teacher) and diagramming modules (e.g. Chart) • Blocks are connected with arrows: arrows denoted with the letters D, F, S, K mean inter-layer synapse sets, undenoted arrows show simple information flow • Objects in a diagram can be pulled from the Toolbar and edited with Right Click|Menu: • Properties: edit the properties of the given object • Delete: delete the object • Inspect: see the actual data streaming in the object (e.g. at synapse sets you can see the synaptic weights) • Line, Rectangle and other graphic tools have no function, they just help visually enhance the diagram • The learning method is not selected manually, it is determined by which objects we build together

  6. Content of the Presentation (agenda slide repeated; see slide 2)

  7. Joone: Building Blocks of the Neural Model: Neuron Layers 1 • Neuron Layers are Java objects which can run on multiple CPUs in multi-thread (parallel) processing. In the network model they are blocks showing their type name and width (number of neurons). They can work in 2 modes (a plain-Java sketch of the activation formula follows below): • Activation mode: neurons aggregate the x input vector multiplied by the W weight matrix of the incoming synapse set, then the b biases of the neurons are added to the product. These sums are transformed by the f(x) signal functions of the neurons into the y output vector • Backpropagation learning mode: neurons aggregate the Δy difference between the y actual and y* desired output vector multiplied by the W weight matrix of the outgoing synapse set, then the b biases of the neurons are added to the product. These sums are transformed by the f⁻¹(x) inverse of the signal function of the neurons, creating the Δh weight-modifier vector. This is multiplied by the W weight matrix of the incoming synapse set to get Δx, which is the difference between the x actual and x* desired inputs. (In practice, instead of the f⁻¹(x) inverse, the reciprocal of the first-order derivative, 1/f'(x), is used) • Types of neuron layers in Joone: • Linear Layer: y = b × x (3.1) It is used only as an input layer, merely rescaling the signal • Bias Linear Layer: y = x + b (3.2) At backpropagation, it adjusts the biases of the input neurons with the Δx input difference • Sigmoid Layer: y = 1 / (1 + e^(-x)) (3.3) General hidden or output layer with [0,1] output signal range • Tangent Hyperbolic Layer: y = (e^x - e^(-x)) / (e^x + e^(-x)) (3.4) General hidden or output layer with [-1,+1] output signal range • Logarithmic Layer: y = ln(1+x) for x ≥ 0 and y = ln(1-x) for x < 0 (3.5) It slows down the growth of aggregated values toward infinity; additionally it amplifies small signals to raise their differences to a reasonable level • Sine Layer: y = sin(x) (3.6) It is used to detect periodic cycles
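As a minimal illustration of the activation-mode formula above, here is a plain-Java sketch (not Joone's own classes) computing y = f(W·x + b) for one layer with some of the listed signal functions; the toy weight matrix and inputs are assumptions for demonstration only:

```java
import java.util.function.DoubleUnaryOperator;

public class LayerActivationDemo {
    // Signal functions from Eqs. (3.3), (3.4) and (3.6)
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));
    static final DoubleUnaryOperator TANH    = Math::tanh;   // (e^x - e^-x)/(e^x + e^-x)
    static final DoubleUnaryOperator SINE    = Math::sin;

    /** Activation mode: y = f(W*x + b) for one neuron layer. */
    static double[] activate(double[][] w, double[] x, double[] b, DoubleUnaryOperator f) {
        double[] y = new double[w.length];
        for (int i = 0; i < w.length; i++) {
            double sum = b[i];                       // bias of neuron i
            for (int j = 0; j < x.length; j++) {
                sum += w[i][j] * x[j];               // aggregate weighted inputs
            }
            y[i] = f.applyAsDouble(sum);             // apply signal function
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] w = {{0.5, -0.2}, {0.1, 0.4}};    // 2x2 toy weight matrix (assumed)
        double[] x = {1.0, 2.0};
        double[] b = {0.0, 0.1};
        double[] y = activate(w, x, b, SIGMOID);
        System.out.printf("y = [%.4f, %.4f]%n", y[0], y[1]);
    }
}
```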

  8. Joone: Building Blocks of the Neural Model: Neuron Layers 2 • Delay Layer: y(t) = x(t-T) (3.7) It is a temporal buffer of the neurons' signals over T time periods called Taps. It has N×(T+1) neurons and outputs, forming a sliding time window on a time series • Context Layer: y(t) = b × (x(t) + y(t-1) × w) (3.8) It feeds back intra-layer from output to input with a weight w < 1, creating a passively decaying memory; it is used in the Bidirectional Associative Memory (BAM) • Basic Gaussian Layer: y = exp(-x²) (3.9) It signals near the value 0. It is used to transform monotonic functions into non-monotonic ones • RBF Gaussian Layer: y = exp(-(x-a)² / VAR(x)) (3.10) It signals the relative Euclidean proximity to parameter a • Winner Take All (WTA) Layer: y(n*) = 1 if f(x(n*)) is maximal, else 0 (3.11) It is used in the Kohonen Self-Organizing Map (SOM): • Neurons of a layer form a 2- or more-dimensional space grid, where they have constant grid coordinates (e.g. 2nd row, 4th column); additionally they have changing coordinates in decision space as the w(n) weight vectors of their incoming synapses • In activation, the n* winner neuron is the one whose w(n) weight vector is closest to the x input vector by the D(w(n), x) Euclidean distance • In training, only the n* winner neuron learns, modifying its w(n*) weights towards the x input vector • Parametered Gaussian Layer: y = exp(-(D(x,w))² / (VAR(x) × e)) (3.12) It is also used in the Kohonen SOM: • The same as above, except that there can be more winners which are close enough to the input by the bell-shaped Gaussian relative proximity formula • During the e epochs, it narrows down the proximity to focus on the closest winners • In training, more winners can modify their weights towards the input vector • Nested ANN Layer: it embeds a different type of network as a single layer of the current network. This is the only way to build networks with custom learning methods in Joone (otherwise the learning method is dictated by the topology). You can determine whether the nested ANN is trained with the main net or just takes part in activation
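A plain-Java sketch (again not the Joone classes themselves) of the two temporal building blocks above: a delay buffer implementing y(t) = x(t-T) from Eq. (3.7) and a context unit implementing the decaying memory y(t) = b·(x(t) + w·y(t-1)) from Eq. (3.8); the parameter values are illustrative assumptions:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TemporalUnitsDemo {

    /** Delay buffer: returns the input seen T steps ago (Eq. 3.7). */
    static class DelayUnit {
        private final Deque<Double> buffer = new ArrayDeque<>();
        private final int taps;
        DelayUnit(int taps) { this.taps = taps; }
        double step(double x) {
            buffer.addLast(x);
            // Until the buffer is full, echo the oldest value we have so far
            if (buffer.size() > taps + 1) buffer.removeFirst();
            return buffer.peekFirst();
        }
    }

    /** Context unit: passively decaying memory y(t) = b * (x + w * y(t-1)) (Eq. 3.8). */
    static class ContextUnit {
        private final double b, w;   // w < 1 gives a decaying memory
        private double y;            // previous output, initially 0
        ContextUnit(double b, double w) { this.b = b; this.w = w; }
        double step(double x) {
            y = b * (x + w * y);
            return y;
        }
    }

    public static void main(String[] args) {
        DelayUnit delay = new DelayUnit(2);          // T = 2 taps (assumed)
        ContextUnit context = new ContextUnit(1.0, 0.5);
        double[] series = {1, 2, 3, 4, 5};
        for (double x : series) {
            System.out.printf("x=%.0f  delayed=%.0f  context=%.3f%n",
                    x, delay.step(x), context.step(x));
        }
    }
}
```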

  9. Joone: Building Blocks of the Neural Model: Synapse Sets, Input Layers • A Synapse Set object is memory containing a W weight matrix (to see it: Right Click|Inspect menu) which is shared by the 2 connected neuron layer objects: the previous one writes its output here (which gets weighted), the next one aggregates its inputs from here. It is denoted by an arrow with a letter. Its types are: • Direct Synapse: 1:1 relation between the neurons of 2 layers • Full Synapse: n:m relation between layers with n and m neurons, forming their Cartesian product • Delayed Synapse: it uses a Finite Impulse Response (FIR) filter to transfer and smooth signals in time. It is used with time series • Kohonen Synapse: it can connect a linear input layer and a WTA or Gaussian output layer to create a Kohonen SOM (see Presentation 6) unsupervised learning network: decision-variable neurons of the input layer are connected to the decision-space grid neurons of the output layer • Sanger Synapse: it connects a wider input layer with a narrower output layer, performing unsupervised learning equivalent to Factor Analysis in statistics: it reduces the number of variables while trying to preserve their information • Input Layer objects handling input data in Joone: • File Input Layer: it reads a text file with database-table structure, where columns are numeric variables only, delimited with the ";" character, rows are observations, and there are no column names in row 1 • URL Input Layer: it downloads text files in the above format from the given URL • Excel Input Layer: it reads data from the Excel file given in its FileName, Worksheet properties: full columns of the worksheet will be variables, rows are observations • JDBC Layer: it downloads data from any database manager accessible through a JDBC driver. Drivername: name of the driver, dbURL: path of the database, SQLQuery: query in SQL • Yahoo Finance Layer: it downloads stock price data from http://finance.yahoo.com/ • Image Input Layer: it reads a JPG-format image; used with SOM networks • Memory Input Layer: it reads data from a memory pointer put there by other programs • Input Connector/Switch: it shares the same input among several networks, so we don't have to load it several times separately

  10. Joone: Building Blocks of the Neural Model: Output Layers, Special Layers • Output layer objects handling output data in Joone: • The Output Switch can switch among several possible outputs provided by Output Layers: the File one writes into a ";"-delimited text file, the Excel one into an Excel worksheet, the JDBC one into a JDBC database, the Image one into a JPG image • The Teacher Layer trains the network: • It connects to the last layer of the network with an undenoted arrow; this is how it gets the outputs of the network • Moreover it can be connected with an undenoted arrow to a Learning Switch Layer, which is connected to the Input Layers of the training and test samples of cross-validation; that is how it gets the desired outputs • It compares estimated outputs with desired ones; their differences form the e error vector, from which it computes the Root Mean Squared Error, RMSE (the average error per output variable and per observation) • This can be plotted over the epochs with a Chart Output Layer connected to the Teacher Layer with an undenoted arrow • Then it shifts the network layers into learning mode and gives back the error vector to the last layer, which processes it and modifies the W weights of its incoming synapse set. Then the residual errors are passed to the previous layer (this process is repeated at all previous layers as well) • The Teacher Layer can handle 3 types of learning algorithms (a sketch of the three update rules follows below): • BackProp (plain backpropagation): (see Pres. 5) It modifies the weights once after every sample:  less memory required,  bad convergence • BatchBackProp (batched backpropagation): (see Pres. 5) Within 1 epoch it aggregates the requested weight changes over all samples and modifies the weights only once at the end of each epoch:  requires more memory,  better convergence • RPROP (resilient backpropagation): (see Pres. 5) It modifies the weights by the sign of the requested weight changes but with a constant unit:  convergence is slower,  but the danger of getting stuck in suboptima is smaller
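To make the three update styles and the reported error concrete, here is a hedged plain-Java sketch (not Joone code) of a single weight updated under plain backprop, batch backprop, and an RPROP-style sign update, plus the RMSE the Teacher Layer reports; the gradients and the constant 0.01 step of the sign update are illustrative assumptions:

```java
public class UpdateRulesDemo {

    /** RMSE over all observations and output variables. */
    static double rmse(double[][] desired, double[][] estimated) {
        double sum = 0;
        int count = 0;
        for (int i = 0; i < desired.length; i++) {
            for (int j = 0; j < desired[i].length; j++) {
                double d = desired[i][j] - estimated[i][j];
                sum += d * d;
                count++;
            }
        }
        return Math.sqrt(sum / count);
    }

    public static void main(String[] args) {
        double w = 0.5;                        // a single weight, for illustration
        double[] gradients = {0.2, -0.1, 0.3}; // assumed per-sample gradients in one epoch
        double learningRate = 0.2;

        // Plain backprop: update after every sample
        double wPlain = w;
        for (double g : gradients) wPlain -= learningRate * g;

        // Batch backprop: aggregate gradients, update once at the end of the epoch
        double sum = 0;
        for (double g : gradients) sum += g;
        double wBatch = w - learningRate * sum;

        // RPROP-style: step by the sign of the aggregated gradient with a constant unit
        double wRprop = w - Math.signum(sum) * 0.01;

        System.out.printf("plain=%.4f batch=%.4f rprop=%.4f%n", wPlain, wBatch, wRprop);
        System.out.printf("RMSE example: %.4f%n",
                rmse(new double[][]{{1, 0}}, new double[][]{{0.9, 0.2}}));
    }
}
```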

  11. Joone: Building Blocks of the Neural Model: Plug-In Modules • Input Plug-Ins: they prepare the variables of the input training and validation samples to get loaded into the network through the Learning Switch: • CenterOnZero: sets the average of the variables to 0 • Normalizer: transforms variable values linearly into a [0,1] closed interval • TurningPointExtractor: extracts local minima and maxima from a time series • MovingAverage: smooths a time series by averaging k elements in a 1-step sliding time window • DeltaNorm: the differences of neighboring elements of a time series are normalized into [0,1] • Shuffler: it presents samples in random order during training to decrease the possible distorting effect of the sample sequence • Binary: breaks an integer variable into a set of binaries • Output Plug-Ins: • UnNormalizer: denormalizes network data back into the original value interval of the variables • Monitor object: enables controlling the network during runtime and stores its parameters: learning rates, first-order derivatives of the signal functions in backpropagation, epoch numbers, etc. Monitor Plug-Ins are: • LinearAnnealing: it continuously decreases the learning rate by a given % during the epochs, for better convergence and stability of backpropagation at the end of learning • DynamicAnnealing: if the RMSE error decreased during the previous k epochs, it increases the learning rate, as backpropagation probably changes the weights towards the optimum; if the RMSE increases, the learning rate is reduced, as backpropagation is probably going the wrong way • ErrorBasedTermination: the e number of epochs is not fixed in advance; instead, it stops when the RMSE goes under a predefined level • Other Plug-Ins: • ChartHandle: it distributes a data stream among several chart objects • ChartLayer: single-variable line chart • VisadChartLayer: multivariate line chart • NeuralNet object: it compresses the Java class hierarchy of a network into a single portable file
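A small plain-Java sketch of two of the plug-in behaviors described above: [0,1] normalization with the matching denormalization, and a linearly annealed learning rate; the start/end rates and the sample prices are illustrative assumptions, not Joone defaults:

```java
public class PluginDemo {

    /** Normalizer: linear rescaling of values into the [outMin, outMax] interval. */
    static double[] normalize(double[] v, double outMin, double outMax) {
        double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
        for (double x : v) { min = Math.min(min, x); max = Math.max(max, x); }
        double[] out = new double[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = outMin + (v[i] - min) / (max - min) * (outMax - outMin);
        }
        return out;
    }

    /** UnNormalizer: maps a [0,1] value back into the original [min, max] range. */
    static double denormalize(double y, double min, double max) {
        return min + y * (max - min);
    }

    /** LinearAnnealing: learning rate decreasing linearly from start to end over the epochs. */
    static double annealedRate(double start, double end, int epoch, int totalEpochs) {
        return start + (end - start) * epoch / (double) (totalEpochs - 1);
    }

    public static void main(String[] args) {
        double[] prices = {98.5, 101.2, 99.8, 103.4};
        double[] norm = normalize(prices, 0.0, 1.0);
        System.out.printf("normalized[3]=%.3f, back=%.1f%n",
                norm[3], denormalize(norm[3], 98.5, 103.4));
        for (int e = 0; e < 5; e++) {
            System.out.printf("epoch %d rate=%.3f%n", e, annealedRate(0.8, 0.2, e, 5));
        }
    }
}
```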

  12. Content of the Presentation (agenda slide repeated; see slide 2)

  13. Joone: Using the Neural Model: Building the Network • Adding an Excel Input Layer: we will use an Excel input data source • AdvancedSerieSelector=2,14: the 2nd-14th columns are input variables • FileName=3-1Sample.xls: name of the Excel file • FirstRow=2, LastRow=131: rows read (the 1st holds column names, the 132nd-179th will be the validation sample) • Normalizer plugin: normalizes variable values into a closed interval • AdvancedSerieSelector=1,13: do it for the 1st-13th columns • (In,Out)Data×(Max,Min): maximum/minimum of the in/output data • A second Normalizer for the 132nd-179th rows, at the validation sample • Learning Switch: it shifts between the training and validation data sources • Adding a Linear Layer: it will be the input layer of the 3-layer perceptron • Rows=13: number of input neurons, Beta=1: their constants • Adding synapse sets (Full, Direct, Sanger, Kohonen): add a Full one here • Loopback: is there any feedback; Inspect|Copy: you can copy initial synaptic weights here if you do not want to start from random ones! • Adding a further Sigmoid Layer: Rows=10: 10 neurons wide • FlatSpotConstant=0: the inflexion point of the signal function is at 0 • Adding an Excel Output Layer to the last network layer: FileName=3-1Output.xls, SheetName: where to save the output variables • Adding an UnNormalizer plugin: denormalizes the output into its original range • AdvancedSerieSelector=1: do it for the 1st output variable • (In,Out)Data×(Max,Min): maximum/minimum of the in/output data • Adding a ChartHandleLayer: as many times as many series we want to show • Serie=1: column of the series in the data • Red=0, Green=, Blue=200: RGB color code • BasicChartLayer: line chart of the series • Serie: column number in the chart handle • MaxXAxis, MaxYAxis: X and Y axis maximum values • Resizeable: is the diagram resizeable or not • Show: show the diagram or not (default is No!)
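The same 3-layer perceptron can also be assembled in code with the Joone engine API instead of the editor. The following is only a hedged sketch: the class and method names are given as documented in the Joone 2.x manual and may differ in your version, and the GUI-only pieces (Excel input, Normalizer, charts) are omitted:

```java
// Hedged sketch of the wine perceptron (13 inputs, 10 hidden neurons, 1 output)
// using the Joone engine API; verify class/method names against your Joone version.
import org.joone.engine.FullSynapse;
import org.joone.engine.LinearLayer;
import org.joone.engine.SigmoidLayer;
import org.joone.net.NeuralNet;

public class WinePerceptronSketch {
    public static void main(String[] args) {
        // Layers: 13 input variables, 10 hidden neurons, 1 output (wine category)
        LinearLayer input = new LinearLayer();
        SigmoidLayer hidden = new SigmoidLayer();
        SigmoidLayer output = new SigmoidLayer();
        input.setRows(13);
        hidden.setRows(10);
        output.setRows(1);

        // Full synapse sets: input -> hidden and hidden -> output
        FullSynapse inToHidden = new FullSynapse();
        FullSynapse hiddenToOut = new FullSynapse();
        input.addOutputSynapse(inToHidden);
        hidden.addInputSynapse(inToHidden);
        hidden.addOutputSynapse(hiddenToOut);
        output.addInputSynapse(hiddenToOut);

        // Wrap the layers into a NeuralNet object
        NeuralNet net = new NeuralNet();
        net.addLayer(input, NeuralNet.INPUT_LAYER);
        net.addLayer(hidden, NeuralNet.HIDDEN_LAYER);
        net.addLayer(output, NeuralNet.OUTPUT_LAYER);
    }
}
```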

  14. Joone: Using the Neural Model: Adding Training Modules • To train the network we have to add the following modules: • Teacher Layer: this controls training • Basic Chart Layer: this is connected to the Teacher Layer to plot the decrease of the RMSE during the epochs • Learning Switch: this loads the desired outputs of supervised learning into the Teacher Layer: • The Input Layer connected to it first is always the training sample, the second is the test sample • At both Excel Input Layers, the Advanced Column Selector property picks up the 1st column from the Excel worksheet, which is the output variable (quality of the wine), but at the training sample it reads rows 2-131 while at the test sample it reads rows 132-179 • We have to connect the same Normalizer plug-in to both the training and the test sample as we did at the input, using the same settings!

  15. Joone: Using the Neural Model: Training • Training of the network can be done at Tools|Control Panel: • BatchSize=0: how many observations to put in one batch at batch-learning backpropagation (0: no batch creation) • Learning=Yes: should it learn or only estimate from inputs? • LearningMode=0: learning method (0: BackProp, 1: Batch BackProp, 2: RProp) • LearningRate=0.2: learning rate • Momentum=0.3: the momentum is the rate of smoothing weight changes during learning (0: no smoothing of weight changes) • Pre-learning cycles: number of activation cycles initializing the network before learning. It is used with delay layers • SingleThreadMode: use only one processor;  it can run on any hardware,  but slower • Supervised: is it supervised learning? It tries to make the estimated outputs similar to the desired outputs (if there are any) • Epochs: fixed number of epochs. If there is an Error Based Termination plugin, it can stop learning at a preset RMSE • Training patterns: number of training patterns • UseRMSE: compute the RMSE during runtime. It is important for the Dynamic Annealing plugin, which continuously updates the learning rate inversely to the RMSE change • Validation: use the validation sample given at the I/O Learning Switch to cross-validate the estimation efficiency of the network • Validation patterns: number of validation patterns • Run button: run, Pause: stop, Continue: resume running • If the Teacher Layer has a connected Chart Layer and its Show property is True, then during runtime you can see the RMSE on a diagram in a separate window!
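The same Control Panel settings can also be set in code through the network's Monitor object. A hedged continuation of the earlier sketch, assuming a teacher and data sources have already been attached to the net; the method names are as I recall them from the Joone manual (including the deliberately spelled setTotCicles) and should be verified against your version:

```java
import org.joone.engine.Monitor;
import org.joone.net.NeuralNet;

public class TrainingConfigSketch {
    static void train(NeuralNet net) {
        Monitor monitor = net.getMonitor();
        monitor.setLearningRate(0.2);      // Control Panel: LearningRate
        monitor.setMomentum(0.3);          // Control Panel: Momentum
        monitor.setTrainingPatterns(130);  // rows 2-131 of the training sample
        monitor.setTotCicles(100);         // Control Panel: Epochs
        monitor.setLearning(true);         // Learning=Yes
        net.go();                          // start training (runs asynchronously)
    }
}
```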

  16. Joone: Using the Neural Model: Post-Training Activities • If there is a runtime error, or the RMSE does not decrease during learning, we can debug the network in the Tools menu: • ToDo List: list of error messages in a separate window • Add noise: it "shakes" the network weights by adding small +/- random values. It can be used if the RMSE decreased well for a while, then got stuck. However, its repeated use can totally confuse learning! • Randomize: it randomizes all network weights to restart learning from the beginning • Reset Input Streams: at delay layers/synapses, it resets the time windows on the time series to the starting position • Using the network to estimate outputs for new inputs: in the Input Layer.Filename property, give the name of a file which has the same format but contains the new inputs with an empty output column • At Tools|Control Panel: • Epochs=1: 1 epoch is enough to go through and estimate the output variable(s) for all rows • Learning=No: no learning (as there is no desired output) • Validation=No: no validation (as there is no desired output) • Run button: run • The input and output variable(s) of all rows are written into a separate file given at Output Layer.Filename • Saving the network in the File menu: • Save as: saves the network into a *.SER file. This contains all objects of the neural model (even non-functional graphic objects) but it is not portable; it can only be opened in the Joone editor • Export Neural Net: saves the network into a *.SNET file, which does not contain the non-functional graphic objects, but it is portable and can be called from other Java applications • Export as XML: saves the network into an *.XML file. It can be used as a function with input parameters and an output return value in XML

  17. Joone: Using the Neural Model: Activation of the Network • Here we show the activation of our wine example network: • First, the linear input layer loads the next input sample, • Then it adds it to its biases (they are all 0 now) • The signal outgoing from the input layer is multiplied by the weights of the 1st synapse set • The weighted signals are summed by the hidden sigmoid layer, which adds its biases • The Si(xi) sigmoid signals outgoing from the hidden layer are multiplied by the weights of the 2nd synapse set • The sigmoid output layer sums them up and adds its biases • The Si(xi) sigmoid signals of the output layer are written into the output Excel file. Variable values are normalized into the [0,1] interval, which can be transformed into the {1..3} categories using a simple Excel function: =ROUND(A1*2,0)+1
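The final step, mapping the normalized [0,1] output back to the {1..3} category labels, is what the Excel formula =ROUND(A1*2,0)+1 does; a one-line Java equivalent follows, with the sample output values assumed for illustration:

```java
public class CategoryMappingDemo {
    /** Equivalent of the Excel formula =ROUND(y*2,0)+1 for a [0,1] network output. */
    static int toCategory(double y) {
        return (int) Math.round(y * 2.0) + 1;   // 0.0 -> 1, 0.5 -> 2, 1.0 -> 3
    }

    public static void main(String[] args) {
        double[] outputs = {0.07, 0.48, 0.93};  // assumed sample network outputs
        for (double y : outputs) {
            System.out.printf("output %.2f -> category %d%n", y, toCategory(y));
        }
    }
}
```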

  18. Content of the Presentation (agenda slide repeated; see slide 2)

  19. Joone: Forecasting Neural Models: Temporal Associative Memory 1 • Temporal Associative Memory (TAM): both in stock exchange/FOREX and in technical fields (e.g. in voice recognition) there can be a problem of recognizing complex waveforms resulting from the interaction of different-frequency movements embedded into each other, and forecasting them by learning a historic sample database and making an extrapolation into the future • The TAM is a BAM with a one-dimensional neuron field • We take n-element time windows from an m-element historic time series sample • The window slides forward on it one step at a time over t = 1..T time periods • During training, we slide the window over the series in e = 1..E epochs • In activation, we let the TAM overrun the end of the series and compute extrapolated values by its own iteration • The TAM is the equivalent of Autoregression (AR) models in statistics, but it is not hindered by multicollinearity.

  20. Joone: Forecasting Neural Models: Temporal Associative Memory 2 • The TAM network (see its model in TAM.ser) has a Context Layer containing a delayed self-connection in time with Time Constant = 0.5 periods • We load from Yahoo 1 time series containing 200 daily close prices of the TSCO.L stock. We use a Normalizer plugin to normalize it into the [0,1] interval. The normalized data is loaded by a Delay Layer which slides 5 periods over the time series (Taps=5), converting it to 6 parallel series: all inputs will be 6-day time windows (x(t), x(t-1), x(t-2), ...) sliding day by day • The Teacher Layer loads the next-day desired outputs (TSCO.L Next day data) and performs Hebb-type learning on the intra-layer connections of the Context Layer. So the activation of the network synchronizes with any compound cyclic movement shorter than 6 periods
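What the Delay Layer with Taps=5 does to the price series can be sketched in plain Java (not Joone code): each row of the result is a 6-element window (x(t), x(t-1), ..., x(t-5)) sliding forward one day at a time; the toy price values are assumptions:

```java
import java.util.Arrays;

public class SlidingWindowDemo {
    /** Converts a series into rows of (taps+1) delayed copies, like a Delay Layer. */
    static double[][] slidingWindows(double[] series, int taps) {
        int rows = series.length - taps;
        double[][] windows = new double[rows][taps + 1];
        for (int t = taps; t < series.length; t++) {
            for (int d = 0; d <= taps; d++) {
                windows[t - taps][d] = series[t - d];   // x(t), x(t-1), ..., x(t-taps)
            }
        }
        return windows;
    }

    public static void main(String[] args) {
        double[] closes = {1, 2, 3, 4, 5, 6, 7, 8};     // toy daily close prices
        for (double[] w : slidingWindows(closes, 5)) {
            System.out.println(Arrays.toString(w));
        }
    }
}
```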

  21. Joone: Forecasting Neural Models: Temporal Associative Memory 3 • The Linear Output Layer writes the forecasted daily data through the Excel Output Layer into the TAMOutPut.xls file, and plots it on the Estim Chart (blue curve: desired output, given by the Facts Chart Handle; red curve: estimation, given by the Estimate Chart Handle) • The RMSE Chart shows the error during 100 epochs of learning (100× loading of 200 samples with 6 variables). We can see that the TAM synchronizes with the vibration of the time series around the 40th epoch, but after that the RMSE error cannot go under 0.051 (this is the average error made estimating the next-day stock price normalized into the [0,1] interval). On the Estim Chart we can see the reason: the TAM captures the plus/minus waves of the TSCO.L stock price in the right phase, but it overestimates the amplitude of the movement. This can be corrected by experimenting with different settings of Context Layer.TimeConstant, Width and Tools|ControlPanel.LearningRate in more trials (change only 1 parameter at a time!)

  22. Joone: Forecasting Neural Models: Recurrent Networks • TAM is the simplest forecasting network, but it has very serious disadvantages: • It requires a time series at least double the length of its time window, as it has to synchronize itself in activation before it can forecast • Convergence problems: if the waveforms in the time window are too similar, it can iterate among them very ambiguously. We call this Crosstalk among similar training patterns • It can give only a continuous output instead of the more reliable Sell/Buy/Do nothing decisions. As it is unsupervised learning, it is hard to change the outputs • These problems are addressed by Recurrent Backpropagation networks, which are supervised learning networks: • It is a 3-layer perceptron where the input layer gets samples from a time window equal to its width, sliding forward on a time series. It has 2 versions (a minimal sketch of the two feedback patterns follows below): • In Jordan Networks the output layer has a width equal to the input layer and feeds back into it directly • In Elman Networks the hidden layer feeds back into itself • Both networks can be trained more easily and have better convergence than TAM • Theoretically, Jordan can learn more difficult temporal patterns as it feeds back from the output layer • Elman is much closer to TAM, but as it has an extra output layer, it is easier to transform its output.
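To make the two feedback patterns concrete, here is a plain-Java sketch (not Joone code) of one time step of each recurrence, reduced to a single hidden and a single output unit with assumed toy weights:

```java
public class RecurrenceDemo {
    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    public static void main(String[] args) {
        double[] series = {0.2, 0.4, 0.9, 0.3, 0.1};
        double wIn = 0.8, wFb = 0.5, wOut = 1.2;    // toy weights (assumed)

        // Jordan: the previous OUTPUT is fed back into the input side
        double yPrev = 0.0;
        for (double x : series) {
            double h = sigmoid(wIn * x + wFb * yPrev);
            double y = sigmoid(wOut * h);
            yPrev = y;                               // feedback from the output layer
            System.out.printf("Jordan y=%.3f%n", y);
        }

        // Elman: the previous HIDDEN state is fed back into the hidden layer
        double hPrev = 0.0;
        for (double x : series) {
            double h = sigmoid(wIn * x + wFb * hPrev);
            double y = sigmoid(wOut * h);
            hPrev = h;                               // feedback from the hidden layer
            System.out.printf("Elman  y=%.3f%n", y);
        }
    }
}
```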

  23. Joone: Forecasting Neural Models: Recurrent Networks: Jordan Network • The Jordan network (see model: Jordan.ser) differs from normal backpropagation by a Sigmoid Output Layer which has a time-delayed feedback (Loopback=Yes) towards the input Delay Layer (the delay is made by a Context Layer inserted into the feedback, setting its TimeConstant = 0.5) • Checking the outputs in the JordanOutPut.xls file and on the Estim Chart we can conclude that in the first 100 epochs it does not learn very well (the RMSE is stuck at 0.055; moreover, on the RMSE Chart we can see that it is even increasing and does not converge to the optimum) • Adjusting the learning parameters may help somewhat to avoid this.

  24. Joone: Forecasting Neural Models: Recurrent Networks: Elman Network • The Elman network is a development of Jordan (see its model in Elman.ser), where the hidden layer has a self-connection (for syntax reasons, it is built from a Linear Layer, 2 Direct and 1 Full synapse sets) • The outputs written into the ElmanOutPut.xls file have the lowest RMSE error so far: 0.033, and this can be lowered even further with more epochs

  25. Joone: Forecasting Neural Models: Forecasting Trend Turning Points 1 • In most practical cases, forecasting trend turning points is more important than exact currency pair prices: • We expect a future income in foreign currency and want to make a hedge against an unfavorable price change • We use a successful trading strategy based on other methods and we would like to forecast when it will fail because of a major price fractal change • In the Turn.ser file we can see a simple 3-layer perceptron with a sliding time window in the input layer, where we use the Turning Point Extractor plugin to emphasize trend turning points. The operation is just the opposite of smoothing and moving averaging (see the sketch after this slide): • It considers a turning point every price change with a sign shift greater than its MinChangePercentage property: it amplifies them to the .Min and .Max of the normalized value interval given in the Normalizer plugin • Be careful to connect the Turning Point Extractor to the Normalizer first, and only then connect it to the Input Layer, otherwise it won't work!
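A plain-Java sketch of the idea behind the Turning Point Extractor (my reading of the plugin's behavior, not Joone's implementation): points where the direction of the price change reverses by more than a minimum percentage are pushed to the edges of the normalized interval, everything else is left as-is; the 2% threshold and the sample prices are illustrative assumptions:

```java
public class TurningPointDemo {
    /**
     * Marks local turning points of a [0,1]-normalized series: a point where the
     * move reverses sign and both legs exceed minChange is amplified to 0 or 1.
     */
    static double[] extractTurningPoints(double[] s, double minChange) {
        double[] out = s.clone();
        for (int t = 1; t < s.length - 1; t++) {
            double prevMove = s[t] - s[t - 1];
            double nextMove = s[t + 1] - s[t];
            boolean reversal = prevMove * nextMove < 0;          // sign shift
            boolean bigEnough = Math.abs(prevMove) > minChange
                             && Math.abs(nextMove) > minChange;
            if (reversal && bigEnough) {
                out[t] = prevMove > 0 ? 1.0 : 0.0;               // local max -> 1, local min -> 0
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] prices = {0.40, 0.45, 0.52, 0.47, 0.41, 0.44, 0.50};
        double[] marked = extractTurningPoints(prices, 0.02);
        for (int t = 0; t < prices.length; t++) {
            System.out.printf("t=%d  price=%.2f  extracted=%.2f%n", t, prices[t], marked[t]);
        }
    }
}
```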

  26. Joone: Forecasting Neural Models: Forecasting Trend Turning Points 2 • Using the TurningPointExtractor, the 3-layer perceptron's estimation written into the TurnOutPut.xls file will have RMSE = 0.042, which is better than TAM and Jordan but worse than Elman • The estimation efficiency of the perceptron is highly influenced by the width of the hidden layer (e.g. 15). Usually there is an optimal value: too many or too few neurons will give inferior results • Unfortunately there is no theory for setting it, you have to make several training experiments!

  27. Joone: Forecasting Neural Models: Forecasting Trends • In many trading strategies we look for relatively longer trends instead of following momentary changes. However, this is more difficult, because it requires extrapolation over a longer time window, which requires a better sample • The Trend.ser file shows an example that, using the Moving Average plugin, we can eliminate from the time series the minor waves shorter than the number of periods preset in its MovingAverage property (see the sketch below). This way the network becomes the analog of the AutoRegressive Moving Average (ARMA) statistical method, but it avoids serious multicollinearity problems: in regression, input variables should be independent from each other, but the contents of 2 neighboring time windows slid by 1 unit are so similar that they totally confuse the regression, and it requires auxiliary methods to select time windows that are less dependent on each other • The outputs written into the TrendOutPut.xls file have an RMSE error of 0.014 estimating the trend without the minor waves, which is better than Elman, even though this was made by a simple 3-layer perceptron!
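A plain-Java sketch of the smoothing the Moving Average plugin performs: a simple k-period trailing average over a 1-step sliding window; k=3 and the sample prices are illustrative assumptions:

```java
public class MovingAverageDemo {
    /** k-period trailing moving average; the first k-1 points average what is available. */
    static double[] movingAverage(double[] series, int k) {
        double[] out = new double[series.length];
        for (int t = 0; t < series.length; t++) {
            int from = Math.max(0, t - k + 1);
            double sum = 0;
            for (int i = from; i <= t; i++) sum += series[i];
            out[t] = sum / (t - from + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] closes = {100, 104, 101, 107, 103, 110, 108};
        double[] smoothed = movingAverage(closes, 3);
        for (int t = 0; t < closes.length; t++) {
            System.out.printf("t=%d close=%.0f ma3=%.2f%n", t, closes[t], smoothed[t]);
        }
    }
}
```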

  28. Joone: Forecasting Neural Models: Backpropagation Through Time 1 • Backpropagation Through Time (BPTT): • It has several full feedforward layers of identical width. Each layer represents a time period in a t = 1..f element time window • There are full feedforward synapse sets connecting more distant layers instead of only neighboring ones • Synapse sets bridging the same distance between layers have identical weights • BPTT by definition works with multiple sample price/indicator time series: the i = 1..n neurons of any layer get the y(i,t) element of the i-th of the t = 1..T time series during e = 1..T-f epochs as input, as a set of time windows sliding forward on the time series • During activation, the differences of the inputs from the x(i,t) membrane values form the deltas of backpropagation learning • Deltas are propagated backward through both the immediate and the bridging synapse sets • One BPTT network is the analog of a series of Autoregression (AR) models from statistics, just avoiding the severe multicollinearity problems • The longest recognizable wavelength depends on the number of layers used •  BPTT requires more setup periods than a 3-layer perceptron until it gets synchronized •  But BPTT has safer learning convergence, as the input is directly loaded at each layer in each period

  29. Joone: Forecasting Neural Models: Backpropagation Through Time 2 • The BPTT.ser network requires price time series of several stocks simultaneously, but the demo version of the Yahoo Input Layer cannot provide that, so we try it out with the TSCO.L stock's series 1-6 (Open, High, Low, Close...) • We use 6 layers denoted by (t, t+1, t+2, ...) to represent 6-element time windows. Their width is 6 as we have 6 series. They are connected by full feedforward Delay Synapses with Taps=1 to form the time windows • The outputs written into the BPTTOutPut.xls file show a very bad RMSE = 0.82. We can conclude that 100 epochs was barely enough for the network just to get somewhat closer to the expected outputs, as the network has a very long synchronization time compared to TAM. Also, because of the random synapse sets bridging distant periods, at the beginning the BPTT output has a very strong smoothing effect, and smaller wavelets can be detected only from a longer sample, using a very large number of epochs. With BPTT it is very critical to find the optimal topology.

  30. Joone: Forecasting Neural Models: Time Delay Neural Network 1 • Time Delay Neural Networks (TDNN): invented to eliminate BPTT's synchronization and oversmoothing problems while still retaining the ability to recognize complex waveforms, like the Head & Shoulders pattern signaling a trend turning/price fractal break • It has a 2-dimensional input layer combining i = 1..n input time series and t = 1..f element time windows sliding on them. The input layer has full feedforward connections towards a very wide 1- or 2-dimensional hidden layer. The weights of these synapses are tied to each other in every k-th period: w(i,t) = w(i,t+k), where k is the maximal detectable wavelength. This enhances the convergence of learning • Hidden neurons symbolize cross-effects among different waveforms in different time series • Hidden neurons have full feedforward connections towards an output layer of n neurons producing the next-period price forecast for the n time series (alternatively they can give discrete Buy/Sell/Do nothing signals) • The network is trained with the conventional BP algorithm • One TDNN network is the analog of a series of AR models using factors as independent variables, set up from the original time-window data using Principal Component Analysis (PCA) •  TDNN has fast synchronization, but learning can get stuck, so it requires a small learning rate and many epochs

  31. Joone: Forecasting Neural Models: Time Delay Neural Network 2 • Our TDNN network (see the model in TDNN.ser) loads 6 time series of TESCO.L (Open, High, Low, Close...) and a Delay Layer extracts 6-element time windows from them (Taps=5), so we have 6×6 = 36 input variables. These are loaded into a 3-layer perceptron which has a very wide (90-neuron) hidden layer to capture all possible cross-effects among the different waveforms in the different time series. The width of the output layer (6) equals the number of estimated time series • Checking the outputs written into the TDNNOutPut.xls file we can see that the estimation is pretty effective and synchronizes fast (the RMSE here has a false value, as the Yahoo Input cannot load 6 different stock prices in parallel) • The most critical parameter in a TDNN is the width of the hidden layer: iterate it to find the optimal value!

  32. References • Joone: • Joone notes in CANAL format (Hungarian): JOONENotes.doc • Joone manual (English): JOONEManual.pdf • Temporal Associative Memories: • TAM simulation in a Java applet: http://franosch.org/applet.html • TAM application in robot movement control: http://www.informaworld.com/smpp/content~content=a713871964~db=all~order=page • Bibliography: http://www.nada.kth.se/sans/annrep94/publications.html • Jordan-type Recurrent Backpropagation: • Internet tutorial: http://www.vias.org/tmdatanaleng/cc_ann_recurrentnet.html • Textbook in PDF using it in financial forecasting: http://dapissarenko.com/resources/fyp/Pissarenko2002.pdf • Elman-type Recurrent Backpropagation: • Mathworks tutorial: http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/histori4.html • Online textbook: http://www.willamette.edu/~gorr/classes/cs449/rnn1.html • Neural Networks Toolbox tutorial: http://www.uweb.ucsb.edu/~weibin/nnet/recur93.html • Backpropagation Through Time: • Introduction to the theory: http://axon.cs.byu.edu/~martinez/classes/678/Papers/Werbos_BPTT.pdf • Internet tutorial: http://www.dlsi.ua.es/~mlf/nnafmc/pbook/node28.html • Time Delay Neural Networks: • Theoretical introduction: http://www.nada.kth.se/~orre/snns-manual/UserManual/node176.html • Usage in Mathworks: http://www.mathworks.com/access/helpdesk/help/toolbox/nnet/dynamic7.html
