Implementing Legacy Statistical Algorithms in Excel using VBA
Learn how to convert FORTRAN code to VBA and run classic statistical algorithms in Excel for graphical and computational analysis. Explore VBA fundamentals and Excel's powerful features.
Implementing Legacy Statistical Algorithms in Excel using VBA
E N D
Presentation Transcript
Implementing Legacy Statistical Algorithms in a Spreadsheet Environment Stephen W. Liddle Information Systems Faculty Rollins eBusiness Center John S. Lawson Department of Statistics Brigham Young University Provo, UT 84602
Overview • Introduction • Fundamentals of VBA in Excel • Retargeting traditional algorithms to a spreadsheet environment • Converting FORTRAN to VBA • Conclusions
Why Convert FORTRAN Programs to Run in a Spreadsheet Environment? • Useful code available that is not implemented in standard statistical packages • FORTRAN compilers not usually available on normal Windows workstation • Many textbooks refer to published FORTRAN algorithms
Sources for Published FORTRAN Algorithms • STATLIB (http://lib.stat.cmu.edu/) • General Archive • Applied Statistics Archive • Journal of Quality Technology Archive • JASA Software Archive • JCGS Archive
Advantages of Running Legacy FORTRAN Code in Excel • Comfortable environment for practitioners • More user friendly input from spreadsheet • Output to spreadsheet allows further graphical and computational analysis of results with Excel functions
Proposed Methodology • Understand original FORTRAN program • Choose suitable I/O methods • Convert original FORTRAN code to VBA • Test and use resulting Excel code
Visual Basic For Applications • Built on ANSI BASIC • Language engine of Microsoft Office • Modern structured programming language • Has vast array of types, functions, programming helps • Powerful support environment (Office platform) • Popular in business contexts
Application Workbooks (Workbook) Worksheets (Worksheet) Range Chart Excel Object Model • Objects in Excel are addressable in VBA • Each object has: • Properties • Methods
Output Region Input Region Clicking these buttons runs the ORPS1 and ORPS2 algorithms. Input/Output Methods • Non-interactive • Files, databases • Worksheet cells • Interactive • Message boxes • Input boxes • Custom GUI forms
FORTRAN vs. VBA • VBA: (-b+Sqr (b^ 2-4*a*c))/(2*a) • FORTRAN: (-b+SQRT(b**2-4*a*c))/(2*a)
More Operators • .EQ. = • .NE. <> • .LT. < • .LE. <= • .GT. > • .GE. >= • .AND. And • .OR. Or • .NOT. Not • // &
Data Types • INTEGER Byte, Integer, Long • REAL Single • DOUBLE PRECISION Double • COMPLEX Non-primitive in VBA • LOGICAL Boolean • CHARACTER String • CHARACTER*length String*length • Other notable VBA types: • Currency, Decimal, Date, Variant
Worksheet Functions • ChiDist(x,deg_freedom) • Returns one-tailed probability of the χ2 distribution. • Correl(array1,array2) • Returns the correlation coefficient of two cell ranges. • Fisher(x) • Returns the Fisher transformation at a given x. • Pearson(array1,array2) • Returns the Pearson product moment correlation coefficient for two sets. • Quartile(array,quart) • Returns the requested quartile of a data set. • StDev(array) • Returns the standard deviation of a data set. • ZTest(array,x,sigma) • Returns the two-tailed P-value of a z-test.
Subtle Differences (“Gotchas”) • Implicit conversion of real to integer values • FORTRAN: truncate • VBA: round • Solution: use VBA’s Fix(), which truncates • Both languages allow implicit typing • This introduces ambiguity • Solution: supply explicit types everywhere
Eliminating Goto Statements • Computer science accepts the axiom that goto is generally “considered harmful” • We advocate rewriting alogrithms to use structured programming techniques where feasible • Sine qua non is “make it work” • It’s a good idea for maintainability, understandability to move to structured form
Eliminating Goto Statements DO 8 J=1,3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 CONTINUE
Eliminating Goto Statements For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 IF(J.EQ.3) GO TO 8 XK=BESTK-STEP 8 Next j
Eliminating Goto Statements For j=1 To 3 ... 6 ... IF(OBJFN.GT.BESTFN) GO TO 7 ... GO TO 6 7 If j <> 3 Then xk = bestk - step End If Next j
Eliminating Goto Statements For j=1 To 3 ... Do Until objfn > bestfn ... Loop If j <> 3 Then xk = bestk - step End If Next j
Our Reasoning • Digital assets are fragile • FORTRAN is not universally available • Excel is a ubiquitous, powerful platform • VBA is a full-featured language capable of handling sophisticated statistical computations
Conclusions • We recommend creating a Web-based repository of Excel/VBA implementations of classic statistical algorithms • We can preserve our legacy algorithms in this modern spreadsheet environment • E-mail us if you want a copy of our manuscript (liddle or lawson@byu.edu)