Unraveling Patterns in Low Replicate Data Using PCA Analysis for Medical and Ecological Studies

A blind search for patterns: Unravelling low replicate data

ExSpec Pipeline

Data: Structure and variability • Structure • Between 500 and 10,000+ features • Each feature has an associated ion count for each aligned sample. • Data is not normally distributed. • Variability • Up to 30% technical variability • Each feature is affected differently
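The slide does not say how the 30% technical variability was measured. A minimal sketch, assuming it is expressed as a per-feature coefficient of variation (CV) across technical replicates; the feature names and ion counts below are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(counts):
    """Return sample standard deviation divided by the mean, as a fraction."""
    m = mean(counts)
    if m == 0:
        return float("nan")
    return stdev(counts) / m


# Hypothetical ion counts: feature name -> counts across three technical replicates.
replicates = {
    "feature_A": [1000.0, 1100.0, 900.0],
    "feature_B": [200.0, 320.0, 140.0],
}

# One CV per feature, since each feature is affected differently.
cv_by_feature = {
    name: coefficient_of_variation(counts) for name, counts in replicates.items()
}

for name, cv in cv_by_feature.items():
    print(f"{name}: CV = {cv:.1%}")
```

Reporting one CV per feature, rather than a single global figure, reflects the point that variability differs from feature to feature.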

Data Structure and variability

Data: Structure and variability The majority of features that are detected are singletons.

Low Replicate data • “Suck it and see” • One-off projects • Pump-priming projects • Medical samples • Biopsy • Difficult to access • Ecological data • Resampling is difficult

Methods • Fingerprinting • PCA • Basic scoring • PDE model • Gradient search • Differential analysis

PCA • Very simple • Can be highly informative • Depends on the data • Used in pipeline • Data quality
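As an illustration of how simple the core of PCA is, here is a minimal sketch using NumPy's singular value decomposition. It is not the ExSpec pipeline's implementation. The data are simulated, and the values are assumed to be already log-transformed ion counts (a common choice for non-normal count data, but an assumption here). The function name `pca_scores` and the group sizes are hypothetical.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the top principal components.

    Returns (scores, loadings, explained_variance_ratio).
    """
    Xc = X - X.mean(axis=0)                      # centre each feature
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components].T               # features x components
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


# Simulated data: 2 groups of 4 samples, 50 features on a log-like scale.
rng = np.random.default_rng(0)
n_features = 50
group_n = rng.normal(10.0, 0.3, size=(4, n_features))
group_t = rng.normal(10.0, 0.3, size=(4, n_features))
group_t[:, :5] += 3.0                            # 5 features differ between groups
X = np.vstack([group_n, group_t])

scores, loadings, explained = pca_scores(X)
for label, row in zip(["N"] * 4 + ["T"] * 4, scores):
    print(label, np.round(row, 2))
print("explained variance ratio:", np.round(explained, 3))
```

In this simulated case the two groups fall on opposite sides of the first component. Whether real data separates this cleanly depends on the data, as the slide notes.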

PCA Analysis: Brno Project • Samples: human biopsy • Replication: biopsy cut into equal parts

PCA Analysis • N group: non-cancer biopsy • T group: cancer biopsy • Using PCA clustering we are able to distinguish between healthy and sick patients

PCA Analysis • PCA revealed profile similarity which correlated with biological evidence

PCA Analysis • Human Urine project • 22 patients sampled • 11 healthy and 11 sick patients • Sample labels dropped

PCA Analysis: Ecological Data • Large number of samples without clear replication.

PCA Analysis • Cluster pattern: • Find the features which hold the cluster pattern

PCA Analysis • Using PCA and profile similarity analysis, a subset of features of interest was found
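The slides do not describe how the features holding the cluster pattern were identified. One possible approach, sketched here as an assumption, is to rank features by the absolute value of their loading on the principal component that separates the clusters. The function name `top_loading_features` and the simulated data are hypothetical.

```python
import numpy as np


def top_loading_features(X, feature_names, component=0, top_n=5):
    """Rank features by absolute loading on one principal component."""
    Xc = X - X.mean(axis=0)                      # centre each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    loading = Vt[component]
    order = np.argsort(-np.abs(loading))[:top_n]
    return [(feature_names[i], float(loading[i])) for i in order]


# Simulated data: 10 samples, 30 features; three features split the samples in two.
rng = np.random.default_rng(1)
n_features = 30
X = rng.normal(0.0, 0.2, size=(10, n_features))
X[5:, [3, 7, 12]] += 2.0
names = [f"feature_{i}" for i in range(n_features)]

top = top_loading_features(X, names, top_n=3)
for name, value in top:
    print(f"{name}: loading = {value:+.3f}")
```

The sign of a loading is arbitrary in PCA, so only its magnitude is used for ranking.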

Basic Scoring • Use Z-score to sort data • Use this to pull out important features • Control – Exp • With a two-class problem we can use PDE modelling.
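A minimal sketch of Z-score sorting for a control versus experiment comparison. The slide gives only "Control – Exp", so the details are assumptions: the difference is taken on a log2 scale, standardised across features, and features are sorted by absolute Z-score. All feature names and ion counts are hypothetical.

```python
from math import log2
from statistics import mean, stdev


def z_scores(values):
    """Standardise a list of values to zero mean and unit sample standard deviation."""
    m = mean(values)
    sd = stdev(values)
    return [(v - m) / sd for v in values]


# Hypothetical ion counts for one control and one experimental sample.
control = {"f1": 100.0, "f2": 250.0, "f3": 80.0, "f4": 500.0, "f5": 120.0}
experiment = {"f1": 105.0, "f2": 240.0, "f3": 400.0, "f4": 510.0, "f5": 118.0}

features = sorted(control)
# Control minus Exp on a log2 scale (the log transform is an assumption).
differences = [log2(control[f]) - log2(experiment[f]) for f in features]
z_by_feature = dict(zip(features, z_scores(differences)))

# Sort so the most extreme features come first.
ranked = sorted(z_by_feature.items(), key=lambda item: abs(item[1]), reverse=True)
for name, z in ranked:
    print(f"{name}: z = {z:+.2f}")
```

Here `f3`, which increases fivefold in the experiment, sorts to the top with a negative score because the difference is control minus experiment.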

Basic Scoring: PDE modelling • Multi-class problem • Plants • Wild type • act ko mutant • Treatments • Normal light • High light

Gradient Analysis • Use rate of change of abundance to • Mine data for specific trends • Find features of interest • Use PDE modelling of rates

Gradient Analysis • Mining for features which showed a rapid increase due to a specific treatment
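A minimal sketch of mining for rapid increases, assuming abundance is measured at known time points after treatment and that the rate of change is estimated by finite differences. The time points, abundances, threshold, and function name `max_rate_of_change` are all hypothetical. The PDE modelling of rates mentioned in the slides is not covered here.

```python
import numpy as np


def max_rate_of_change(abundance, times):
    """Largest finite-difference slope per feature.

    abundance: array of shape (n_features, n_timepoints)
    times:     array of shape (n_timepoints,)
    """
    rates = np.diff(abundance, axis=1) / np.diff(times)
    return rates.max(axis=1)


times = np.array([0.0, 1.0, 2.0, 4.0])           # hypothetical hours after treatment
names = ["feature_slow", "feature_rapid", "feature_falling"]
abundance = np.array([
    [10.0, 10.5, 11.0, 12.0],                    # slow drift
    [10.0, 30.0, 60.0, 70.0],                    # rapid increase
    [50.0, 45.0, 40.0, 30.0],                    # steady decrease
])

max_rates = max_rate_of_change(abundance, times)
threshold = 5.0                                  # hypothetical cut-off, abundance units per hour
rapid = [name for name, rate in zip(names, max_rates) if rate > threshold]
print("features with a rapid increase:", rapid)
```

Dividing by the time step matters when sampling is uneven, as in the last interval here.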

Data provided by: • Ecological data: Dave Hodgson, Nicole Goody • Gradient analysis: John Love • Data scoring: Nicholas Smirnoff, Mike Page • Brno: Ted Hupp, Rob O’Neill • Urine study: Steve Michell, John Mcgrath

Metabolomics and Proteomics Mass Spectrometry Facility @ The University of Exeter • http://biosciences.exeter.ac.uk/facilities/spectrometry/ • http://bio-massspeclocal.ex.ac.uk/ • Nick Smirnoff (Director of Mass Spectrometry): N.Smirnoff@exeter.ac.uk • Hannah Florance (MS Facility Manager): H.V.Florance@exeter.ac.uk • Venura Perera (Bioinformatics and Mathematical Support): V.Perera@exeter.ac.uk

About me • Background • Applied Maths • Untargeted metabolite profiling • Research interests • Data-driven modelling • Small molecule profiling • Gene regulatory network modelling • Application of mathematical methods • Metabolite identification using LC-MS/MS
