Introduction to Theory-Guided Machine Learning: Why Theory Matters

AI-ML Introduction, Part 2: Theory-Guided Data Learning

Outline • Review of previous talk • Essential ML • From data-driven to theory-guided ML • Why theory-guided? • Example: lake temperature modeling (Karpatne et al.) • Applications in CSAM projects • Conclusions

What is ML, essentially? • A thing-labeler • Input: thing • Output: label • Labels can be • Discrete (e.g. “spam”, “not-spam”) • Continuous (e.g. the price of a stock: $31.71, $12.57, …) • ML communicates the programmer’s wish to automatically label data • Programmer implements learning algorithms • Programmer provides examples, not explicit instructions on how to label • Examples express what cannot be expressed by computer code alone
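
To make the "examples, not explicit instructions" idea concrete, here is a minimal sketch of a thing-labeler. It is only an illustration: the two features (message length and number of links), the example values, and the choice of a 1-nearest-neighbor rule are all assumptions made for this sketch, not part of the talk.

```python
# Hypothetical labeled examples: (message length, number of links) -> label.
# The features and values are invented for illustration only.
examples = [
    ((120, 0), "not-spam"),
    ((95, 1), "not-spam"),
    ((30, 6), "spam"),
    ((45, 8), "spam"),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def label(thing):
    """Label a new thing with the label of its closest example (1-nearest neighbor)."""
    _, nearest_label = min(examples, key=lambda ex: squared_distance(ex[0], thing))
    return nearest_label


print(label((40, 7)))
print(label((110, 1)))
```

Note that no rule such as "many links means spam" is ever written down; the behavior comes entirely from the examples.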

Big Data in Science and Engineering • Materials Science • Manufacturing • process flaws • improve production quality • increase efficiency • save time and money • Genomics • Earth Science

The black / opaque box of ML • [Diagram: Input → Deep Learning → Output, with inputs 1 … N] • “Black-box” models learn patterns solely from data without relying on scientific and technical knowledge • Successful commercial applications:

Essential building blocks of ML • Model Design / Architecture • Layers • Activation functions • Optimizers • Hyperparameters • Tunable knobs • Learning rate • Dropout probability • Model Parameters / Weights • Learned from data

Workflow • Select hyperparameters • Train with data • Track progress • Evaluate performance
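
The workflow and the distinction between hyperparameters and learned parameters can be sketched in a few lines. This is a toy illustration under stated assumptions: the data follow an invented relationship y = 3x, the model is a single weight `w`, and the names `train_model`, `results`, and `best_lr` are hypothetical.

```python
# Toy data from y = 3x (an invented relationship), split into train and validation sets.
train = [(x / 10, 3 * x / 10) for x in range(10)]
valid = [(x / 10 + 0.05, 3 * (x / 10 + 0.05)) for x in range(10)]


def mse(w, data):
    """Mean squared error of the one-parameter model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def train_model(learning_rate, epochs=50):
    """Train with data: gradient descent on the training MSE."""
    w = 0.0  # model parameter (weight), learned from data
    history = []
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in train) / len(train)
        w -= learning_rate * grad
        history.append(mse(w, train))  # track progress
    return w, history


# Select hyperparameters: try several learning rates (the "tunable knob").
results = {}
for lr in (0.01, 0.1, 1.0):
    w, history = train_model(lr)
    results[lr] = (w, mse(w, valid))  # evaluate performance on held-out data

best_lr = min(results, key=lambda lr: results[lr][1])
print(best_lr, results[best_lr])
```

The learning rate is chosen by the programmer and compared on validation data, while `w` is never set by hand.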

Why Do Black-box Methods Fail? • Scientific & engineering problems are often under-constrained • Complex, dynamic, and non-stationary relationships • Large number of variables, small number of samples • Standard methods for evaluating ML models (e.g., cross-validation) break down • Easy to learn spurious relationships that look deceptively good on training and test sets • But lead to poor generalization outside the available data
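
The "large number of variables, small number of samples" problem can be illustrated with a small simulation. This is a sketch under assumptions chosen for illustration: 2000 purely random features, 10 samples, and a random target, so any relationship found is spurious by construction.

```python
import random


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)
n_samples, n_features = 10, 2000

# Target and features are independent noise: there is nothing real to learn.
target = [rng.gauss(0, 1) for _ in range(n_samples)]
features = [[rng.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_features)]

# Pick the feature that looks best on the small available sample.
best_in_sample = max(abs(correlation(f, target)) for f in features)

# Outside the available data, the "winning" feature is still just noise,
# so fresh draws of an independent feature and target stand in for it.
n_fresh = 1000
fresh_feature = [rng.gauss(0, 1) for _ in range(n_fresh)]
fresh_target = [rng.gauss(0, 1) for _ in range(n_fresh)]
out_of_sample = abs(correlation(fresh_feature, fresh_target))

print(f"best |correlation| on 10 samples: {best_in_sample:.2f}")
print(f"|correlation| on 1000 fresh samples: {out_of_sample:.2f}")
```

With so many candidate variables and so few samples, some feature is almost certain to correlate strongly with the target by chance, and that apparent relationship vanishes on new data.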

Why Do Black-box Methods Fail? • Interpretability is an important end-goal (esp. in scientific problems) • Need to explain or discover mechanisms of underlying processes to… • Form a basis for scientific advancements • Safeguard against the learning of non-generalizable patterns 12/8/17

Theory-based vs. Data Science Models • Theory-based Models • Contain knowledge gaps in describing certain processes (turbulence, groundwater flow) • Examples from science: • Gravitational Law • Conservation of Mass, Momentum, Energy • Navier-Stokes Equation • Schrödinger’s Equation

Theory-based vs. Data Science Models • Theory-based Models • Contain knowledge gaps in describing certain processes • Data Science Models • Require large number of representative samples • Theory-guided Data Science Models (TGDS)¹ • Take full advantage of data science methods without ignoring the treasure of accumulated knowledge in scientific “theories” • ¹ Karpatne et al., “Theory-guided data science: A new paradigm for scientific discovery,” TKDE 2017

The bias vs. variance trade-off

Learning Physically Consistent Models • Traditionally, “simpler” models are preferred for generalizability • Basis of several statistical principles such as the bias-variance trade-off • M1 (less complex model): high bias, low variance • M3 (more complex model): low bias, high variance • Generalization Performance: Accuracy + Simplicity
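
For reference, the bias-variance trade-off mentioned on this slide is usually stated as the standard decomposition of expected squared error. The notation is an assumption introduced here, not taken from the slides: the data follow $y = f(x) + \varepsilon$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathrm{Var}(\varepsilon) = \sigma^2$, and $\hat{f}$ is the model fitted on a random training set.

```latex
\mathbb{E}\!\left[\big(y - \hat{f}(x)\big)^2\right]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\!\left[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\right]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

A less complex model such as M1 typically has a large first term and a small second term; for a more complex model such as M3 the roles are reversed.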

Learning Physically Consistent Models • Traditionally, “simpler” models are preferred for generalizability • Basis of several statistical principles such as the bias-variance trade-off • M1 (less complex model): high bias, low variance • M3 (more complex model): low bias, high variance • In scientific problems, “physical consistency” can be used as another measure of generalizability • Can help in pruning large spaces of inconsistent solutions • Results in generalizable and physically meaningful results • Generalization Performance: Accuracy + Simplicity + Consistency

Physics-Guided Neural Networks (PGNN) • A Framework for Learning Physically Consistent Deep Learning Models • Scientific Knowledge (Physics) • Used to guide selection of model architecture, activation functions, loss functions, … • Karpatne et al., “Physics-guided neural networks (PGNN): Application in Lake Temperature Modeling,” SDM 2018 (in review; arXiv:1710.11431).

Case Study: Lake Temperature Modeling • Input Drivers: Short-wave Radiation, Long-wave Radiation, Air Temperature, Relative Humidity, Wind Speed, Rain, … • Target Output: Temp. of water at every depth • Physics-based Approach: General Lake Model (GLM)¹ • Captures physical processes responsible for energy balance • Requires lake-specific calibration using large amounts of data and computational resources • RMSE of Uncalibrated Model: 2.57 • RMSE of Calibrated Model: 1.26 (for Lake Mille Lacs in Minnesota) • ¹ Hipsey et al., 2014
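
The RMSE figures quoted on this slide are root-mean-square errors between modeled and observed temperatures. As a reminder of how that metric is computed, here is a minimal sketch; the temperature values are invented for illustration and are not data from the case study.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical water temperatures at four depths (illustrative values only).
observed = [20.0, 18.0, 15.0, 10.0]
predicted = [21.0, 17.0, 16.0, 9.0]

print(rmse(predicted, observed))  # every error is +/-1, so the RMSE is 1.0
```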

PGNN 1: Use GLM Output as Input in Neural Network • Deep Learning can augment physics-based models by modeling their errors • Part of a broader research theme on creating hybrid-physics-data models • Input Drivers + Output of GLM (Uncalibrated) …
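
The idea of feeding a physics model's output into a data-driven model can be sketched on a toy problem. This is not the PGNN implementation: a one-input least-squares line stands in for the neural network, `physics_model` is a hypothetical uncalibrated model, and the observations are synthetic.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed))


def physics_model(driver):
    """Hypothetical uncalibrated physics model: right trend, wrong constants."""
    return 10.0 + 0.8 * driver


drivers = [float(d) for d in range(10)]
observed = [12.0 + 1.0 * d for d in drivers]  # synthetic "measurements"

# Use the physics output as the input of the data-driven model.
physics_output = [physics_model(d) for d in drivers]
slope, intercept = fit_line(physics_output, observed)
hybrid_output = [slope * p + intercept for p in physics_output]

print("physics-only RMSE:", rmse(physics_output, observed))
print("hybrid RMSE:", rmse(hybrid_output, observed))
```

The data-driven part only has to learn how the physics model is wrong, which is a much simpler task than learning the input-output relationship from scratch.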

PGNN 2: Use Physics-based Loss Functions • Temp estimates need to be consistent with physical relationships between temp, density, and depth • Physical Constraint: Denser water is at greater depth
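
A physics-based loss of this kind can be sketched as a penalty on density inversions between consecutive depths. The following is a hedged illustration, not the authors' code: the empirical temperature-density relation and its constants are quoted from memory and should be checked against the PGNN paper before use, and the function names and the `weight` parameter are assumptions.

```python
def water_density(temp_c):
    """Approximate fresh-water density (kg/m^3) at temp_c degrees Celsius.

    Assumption: empirical relation with constants recalled from the literature;
    verify against the PGNN paper before relying on it.
    """
    return 1000.0 * (1.0 - (temp_c + 288.9414) * (temp_c - 3.9863) ** 2
                     / (508929.2 * (temp_c + 68.12963)))


def physics_loss(temps_by_depth):
    """Mean density inversion between consecutive depths (shallow to deep)."""
    densities = [water_density(t) for t in temps_by_depth]
    violations = [max(0.0, shallow - deep)
                  for shallow, deep in zip(densities, densities[1:])]
    return sum(violations) / len(violations)


def total_loss(predicted, observed, weight=1.0):
    """Empirical error (MSE) plus a weighted physical-inconsistency penalty."""
    mse = sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)
    return mse + weight * physics_loss(predicted)


consistent = [20.0, 15.0, 10.0, 5.0]    # warmer (lighter) water on top
inconsistent = [5.0, 10.0, 15.0, 20.0]  # denser water above lighter water

print(physics_loss(consistent))    # no inversion, so no penalty
print(physics_loss(inconsistent))  # positive penalty
```

Because the penalty depends only on the predictions, it can also be evaluated where no observed temperatures are available.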

Summary of approaches • Theory-guided design and learning • Design of model architecture • Choice of loss function, including regularization • Constrained optimization methods • Theory-guided refinement • Post-processing • Pruning • Augmenting theory-based models using data • Calibrating model parameters • Data assimilation

Ensemble and boosting algorithms: applications to CSAM projects • An ensemble of weak models provides better predictive power than a single weak model • Boosting increases the weight of misclassified data and re-trains the model; done many times, this creates a certain type of ensemble with better predictive power
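
The re-weighting idea can be sketched with AdaBoost over one-dimensional decision stumps. AdaBoost is one specific boosting algorithm chosen here for illustration (the slides do not name one), and the six-point dataset is invented so that no single stump can label it correctly.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak model: predict `polarity` left of the threshold, `-polarity` right of it."""
    return polarity if x < threshold else -polarity


def best_stump(xs, ys, weights):
    """Find the stump with the lowest weighted error."""
    best = None
    thresholds = [min(xs) - 0.5] + [x + 0.5 for x in sorted(xs)]
    for threshold in thresholds:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Train an ensemble of weighted stumps."""
    weights = [1.0 / len(xs)] * len(xs)
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = best_stump(xs, ys, weights)
        err = max(err, 1e-12)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Boost the weight of misclassified points, shrink the others.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of all stumps in the ensemble."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]  # not separable by any single threshold

single = adaboost(xs, ys, rounds=1)
boosted = adaboost(xs, ys, rounds=3)
print([predict(single, x) for x in xs])
print([predict(boosted, x) for x in xs])
```

Each stump on its own misclassifies at least two points, while the weighted vote of three stumps labels all six correctly.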

Online learning: applications to CSAM projects • Start from theory-generated data • Perform experiments • Since experiments are slow and expensive, we get a stream of points over time • The model, initially trained on theory-generated data, adjusts by learning from the data coming in the stream
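
The online-learning scheme on this slide can be sketched with a linear model updated by one stochastic gradient step per incoming point. Everything concrete here is an assumption made for illustration: the "theory" relation y = 2x, the "experimental" relation y = 2.5x + 0.3, the learning rate, and the names `sgd_step` and `experiment_stream`.

```python
def sgd_step(w, b, x, y, learning_rate=0.5):
    """One online update of the model y = w * x + b from a single point."""
    error = (w * x + b) - y
    return w - learning_rate * error * x, b - learning_rate * error


def mse(w, b, points):
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)


grid = [0.0, 0.25, 0.5, 0.75, 1.0]

# Step 1: start from theory-generated data (invented theory: y = 2x).
theory_points = [(x, 2.0 * x) for x in grid]
w, b = 0.0, 0.0
for _ in range(500):
    for x, y in theory_points:
        w, b = sgd_step(w, b, x, y)
w_theory, b_theory = w, b


# Step 2: experiments arrive one at a time (invented truth: y = 2.5x + 0.3).
def experiment_stream(n_points):
    for i in range(n_points):
        x = grid[i % len(grid)]
        yield x, 2.5 * x + 0.3


reference = [(x, 2.5 * x + 0.3) for x in grid]
mse_before = mse(w, b, reference)
for x, y in experiment_stream(400):
    w, b = sgd_step(w, b, x, y)  # adjust the model as each point comes in
mse_after = mse(w, b, reference)

print("theory-only model:", w_theory, b_theory)
print("after streaming:", w, b)
print("error before/after:", mse_before, mse_after)
```

The theory-trained model gives a usable starting point before any experiment is run, and each new measurement then moves it toward the observed behavior without retraining from scratch.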

Concluding Remarks • “Black-box” methods (e.g. deep learning) are not sufficient on their own for knowledge discovery in scientific and engineering domains • Scientific and engineering knowledge can be combined with machine learning. This approach goes by different names, one of the most popular being “theory-guided data science” • Use of scientific and engineering knowledge ensures consistency with physical laws • This approach has many applications, including different sub-projects of CSAM

References • Earth Science: • Karpatne et al., “Physics-guided Neural Networks: Application in Lake Temperature Modeling,” SDM 2018 (in review). • Faghmous et al., “Theory-guided data science for climate change,” IEEE Computer, 2014. • Faghmous and Kumar, “A big data guide to understanding climate change: The case for theory-guided data science,” Big Data, 2014. • Material Science: • Curtarolo et al., “The high-throughput highway to computational materials design,” Nature Materials, 2013. • Computational Chemistry: • Li et al., “Understanding machine-learned density functionals,” International Journal of Quantum Chemistry, 2015. • Fluid Dynamics: • Singh et al., “Machine learning-augmented predictive modeling of turbulent separated flows over airfoils,” arXiv, 2016. • “Physical Analytics” Research Division

Thank You!