
PSO for Bioinformatics

Alex Freitas and Colin Johnson, University of Kent



Presentation Transcript


  1. PSO for Bioinformatics
     Alex Freitas and Colin Johnson, University of Kent

  2. People involved in swarm intelligence research at Kent (1)
     • XPS Project
       • Alex Freitas (Lecturer)
       • Colin Johnson (Lecturer)
       • Elon Correa (RA – will start soon)
       • Mudassar Iqbal (PhD student – started Nov. 2004)
         • Initially investigating a dynamic neighborhood topology
         • Interested in bioinformatics – problem to be defined

  3. People involved in swarm intelligence research at Kent (2)
     • Other research students
       • Terry Arnold (3rd-year PhD student)
         • Supervised by Colin
         • Doing research on force-based PSO
       • Nick Holden (MRes student)
         • Supervised by Alex
         • Doing research on “A hybrid PSO/ACO algorithm for hierarchical classification of biological data (enzymes)”
       • Allen Chan (MRes student)
         • Supervised by Alex
         • Doing research on an ACO algorithm for classification of biological data – multi-label classification problem

  4. Introduction to Classification
     • Each record (example) belongs to a predefined class
     • Each example consists of two parts:
       • < predictor attributes, class attribute >, e.g.:
         • < Gender = M, Age = 25, Salary = 35,000, credit = good >
         • < attributes_describing_protein, function = transport >
     • Goal: to predict the class of an example, based on the values of the predictor attributes for that example

  5. Hierarchical classification (1)
     • Hierarchical classes
     • Enzyme Commission (EC) codes have 4 levels, e.g. EC.1.1.1.1
     [Figure: class tree. root → 1, 2 (most general classes); 1 → 1.1, 1.2, 1.3; 2 → 2.1, 2.2 (most specific classes)]

  6. Hierarchical classification (2)
     • Challenges
       • Several predictions must be made for each example – one predicted class at each level of the hierarchy
       • As we go down the hierarchy, there are fewer examples (records) per class – “data fragmentation”
     • Opportunities
       • Information on “class similarities” in the hierarchy
       • Top-down approach: first predict the top-level class, then predict the second-level class among the children of the predicted top-level class, etc., until a leaf class is predicted
       • The cost of misclassifying 1.1 as 1.2 is smaller than the cost of misclassifying 1.1 as 2.1
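The slide states only that misclassifying 1.1 as 1.2 should cost less than misclassifying 1.1 as 2.1; it does not give a cost function. One simple way to obtain that ordering, shown here purely as an assumed illustration, is to count the edges between the two classes in the class tree. The function name `tree_distance` is hypothetical.

```python
def tree_distance(true_class: str, predicted_class: str) -> int:
    """Number of edges between two classes in the class tree.

    Classes are dotted codes such as "1.1" or "2.1"; the root is the
    empty prefix. This distance is an assumed stand-in for a cost.
    """
    t = true_class.split(".")
    p = predicted_class.split(".")
    # Length of the shared prefix = depth of the deepest common ancestor.
    common = 0
    for a, b in zip(t, p):
        if a != b:
            break
        common += 1
    # Walk up from each class to the common ancestor.
    return (len(t) - common) + (len(p) - common)


print(tree_distance("1.1", "1.2"))  # siblings: 2
print(tree_distance("1.1", "2.1"))  # different top-level class: 4
```

Any cost that grows with this distance penalises errors made near the root more heavily than errors between sibling classes.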

  7. A hybrid PSO/ACO algorithm – basic ideas
     • Each particle represents a candidate classification rule
     • Continuous (real-valued) attributes – standard PSO
     • Categorical (nominal) attributes – special treatment; e.g. Gender: “F” or “M” (unordered values)
     • Each categorical attribute is represented by a “pheromone vector”, with one element for each attribute value plus one element for “not used in rule”

       Value:      F     M     “off” (not used in rule)
       Pheromone:  0.6   0.1   0.3

     • General motivation: ACO algorithms, using pheromone, cope well with discrete data
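A minimal sketch of the pheromone vector for one categorical attribute, using the Gender values from the slide. The helper names and the uniform initialisation are assumptions made for illustration; the slides do not say how the vector is initialised.

```python
import random


def make_pheromone_vector(values):
    """Assumed uniform start: equal pheromone on each value plus 'off'."""
    entries = list(values) + ["off"]
    return {entry: 1.0 / len(entries) for entry in entries}


def sample_value(pheromone, rng):
    """Pick one entry with probability proportional to its pheromone."""
    entries = list(pheromone)
    weights = [pheromone[entry] for entry in entries]
    return rng.choices(entries, weights=weights, k=1)[0]


# The Gender example from the slide.
gender = {"F": 0.6, "M": 0.1, "off": 0.3}

rng = random.Random(0)
draws = [sample_value(gender, rng) for _ in range(10_000)]
freq = {entry: draws.count(entry) / len(draws) for entry in gender}
print(freq)  # empirical frequencies, expected to be near 0.6, 0.1, 0.3
```

The extra "off" entry lets the same mechanism decide whether the attribute appears in the rule at all.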

  8. A hybrid PSO/ACO algorithm for predicting hierarchical enzyme classes (1)
     • Class attribute: 4-digit EC code (4 levels of classes)
     • Predictor attributes: Prosite patterns (motifs)
     • A particle represents a classification rule:

       Attribute:  pattern1           . . .   patternn           class
       Value:      yes   no    off            yes   no    off
       Pheromone:  0.3   0.1   0.6            0.8   0.1   0.1    EC.1.5.2.1

       The particle is “decoded” into a rule by choosing a value (“yes”, “no”, “off”) for each attribute, with probability given by its pheromone vector
     • Pheromone values are updated based on rule quality
     • Particle also moves towards previous best and local best
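The decoding step can be sketched as follows, using the pheromone values shown on the slide. The dictionary layout and the names `decode` and `covers` are hypothetical, and the pheromone update and the movement towards the previous and local best are left out because the slides do not specify them.

```python
import random

# One pheromone vector per Prosite pattern, plus the class the rule predicts.
particle = {
    "pheromones": {
        "pattern1": {"yes": 0.3, "no": 0.1, "off": 0.6},
        "patternN": {"yes": 0.8, "no": 0.1, "off": 0.1},
    },
    "predicted_class": "EC.1.5.2.1",
}


def decode(particle, rng):
    """Sample one value per attribute; attributes drawn as 'off' are dropped."""
    conditions = {}
    for attribute, pheromone in particle["pheromones"].items():
        choice = rng.choices(
            list(pheromone), weights=list(pheromone.values()), k=1
        )[0]
        if choice != "off":
            conditions[attribute] = choice
    return conditions, particle["predicted_class"]


def covers(conditions, example):
    """True if the example satisfies every condition of the rule."""
    return all(example.get(attr) == value for attr, value in conditions.items())


rng = random.Random(1)
conditions, predicted = decode(particle, rng)
print("IF", conditions, "THEN", predicted)
```

Because decoding is stochastic, the same particle can yield different rules on different draws; attributes with high "off" pheromone rarely appear in the decoded rule.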

  9. A hybrid PSO/ACO algorithm for predicting hierarchical enzyme classes (2)
     • Algorithm follows top-down (greedy) approach:
       • first discover rules predicting 1st-level class, then discover rules predicting 2nd-level class, etc.
       • this sequential procedure is used in both training and testing
     • Preliminary results (varying some parameters)
       • Predictive accuracy at level 1 (6 classes): 94.9-96.7%
       • Predictive accuracy at level 2 (51 classes): 72.3-90.3%
     • Current/Future work
       • Prediction of levels 3 and 4 of EC code; other data sets
       • Consider different misclassification costs
       • Develop a less greedy method for top-down classification (allowing the recovery from errors in higher levels)
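The greedy top-down procedure at test time might look like the sketch below. The per-level classifiers are toy, hand-written stand-ins for the discovered rule sets, and the names `predict_top_down`, `has_motif_a` and `has_motif_b` are hypothetical.

```python
def predict_top_down(example, classifiers, max_depth):
    """Greedy top-down prediction.

    `classifiers` maps a parent class ("" for the root) to a function
    that returns one of that parent's child classes.
    """
    current = ""
    path = []
    for _ in range(max_depth):
        classify = classifiers.get(current)
        if classify is None:  # no model below this class: stop early
            break
        current = classify(example)
        path.append(current)
    return path


# Toy stand-ins for the rule sets learned at each node of the hierarchy.
classifiers = {
    "": lambda ex: "1" if ex["has_motif_a"] else "2",
    "1": lambda ex: "1.1" if ex["has_motif_b"] else "1.2",
    "2": lambda ex: "2.1",
}

example = {"has_motif_a": True, "has_motif_b": False}
print(predict_top_down(example, classifiers, max_depth=2))  # ['1', '1.2']
```

The sketch also shows why the approach is greedy: once the level-1 prediction is wrong, only children of the wrong class are considered, so the error cannot be recovered lower down.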

  10. Force-based particle swarms
      • Drawing inspiration from physics
        • In particular, ways of simulating fluid flow
      • The idea is to control the flow of particles by assigning forces between particle types, then letting the process run to completion.
      • We can use different force types:
        • Electromagnetic forces
        • Gravitational forces
        • Linear distance-based forces
        • Lennard-Jones potential
        • ...
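As an illustration of how such force types differ, the sketch below writes three of them as functions of the distance r between two particles. The parameter names and default values are assumptions; the Lennard-Jones expressions are the standard textbook form, V(r) = 4ε[(σ/r)^12 − (σ/r)^6], not necessarily the variant used in this work.

```python
def inverse_square_force(r, strength=1.0):
    """Magnitude of a gravity-like or Coulomb-like force at distance r."""
    return strength / r**2


def linear_force(r, strength=1.0):
    """Assumed linear profile: magnitude proportional to distance."""
    return strength * r


def lennard_jones_potential(r, epsilon=1.0, sigma=1.0):
    """V(r) = 4*epsilon*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lennard_jones_force(r, epsilon=1.0, sigma=1.0):
    """Radial force -dV/dr; positive is repulsion, negative is attraction."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r


r_min = 2 ** (1 / 6)  # distance at which the force vanishes when sigma = 1
for r in (0.9, r_min, 1.5):
    v = lennard_jones_potential(r)
    f = lennard_jones_force(r)
    print(f"r={r:.3f}  V={v:+.3f}  F={f:+.3f}")
```

The Lennard-Jones profile is the only one of the three that changes sign: it repels at close range and attracts further out, which makes a preferred spacing between particles emerge.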

  11. Force-based programming language
      • One idea is to create a force-based programming language.
      • We express the problem by saying how forces between pairs of particle types interact.
      • Example: clustering
        • Create fixed particles for the data
        • Create k classes of particles for the cluster-markers
        • Rules:
          • All cluster-markers repel at close range
          • Cluster-markers of different types always repel
          • Cluster-markers are attracted to data.
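The clustering rules can be tried out in a small simulation. Everything numeric here is an assumption made for illustration: the slides do not give force profiles, so the attraction fades as 1 / (1 + d²), the repulsion as 1 / d, and the function names, step size and toy data are hypothetical. The demo uses one marker per type.

```python
import math


def data_attraction(marker, point):
    """Pull of a fixed data point on a marker; fades with distance."""
    dx, dy = point[0] - marker[0], point[1] - marker[1]
    weight = 1.0 / (1.0 + dx * dx + dy * dy)
    return weight * dx, weight * dy


def marker_repulsion(pos_a, type_a, pos_b, type_b, close_range=1.0, strength=1.0):
    """Push on marker a from marker b, following the two repulsion rules."""
    dx, dy = pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]
    dist2 = dx * dx + dy * dy + 1e-9
    if type_a == type_b and math.sqrt(dist2) >= close_range:
        return 0.0, 0.0  # same type: repel only at close range
    return strength * dx / dist2, strength * dy / dist2


def step(markers, data, rate=0.1):
    """Move every marker a small distance along its net force."""
    moved = []
    for i, (pos, kind) in enumerate(markers):
        fx = fy = 0.0
        for point in data:
            ax, ay = data_attraction(pos, point)
            fx, fy = fx + ax, fy + ay
        for j, (other, other_kind) in enumerate(markers):
            if i != j:
                rx, ry = marker_repulsion(pos, kind, other, other_kind)
                fx, fy = fx + rx, fy + ry
        moved.append(((pos[0] + rate * fx, pos[1] + rate * fy), kind))
    return moved


# Two well-separated groups of fixed data points.
data = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 0), (11, 0), (10, 1), (11, 1)]
markers = [((4.0, 0.5), 0), ((7.0, 0.5), 1)]  # (position, type)
for _ in range(500):
    markers = step(markers, data)
print(markers)
```

With these assumed profiles, each marker should drift towards a different group of data points: attraction pulls a marker into the nearest group, while repulsion between marker types keeps the two markers from settling in the same one.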

  12. Demonstrations and videos

  13. Applications
      • Currently applying this to classification algorithms in bioinformatics.
      • Data points will be fixed in the space.
      • Particle attraction/repulsion will be learned using a GA/GP-type strategy to learn:
        • The forces that apply between particle types
        • The shapes of the possible force profiles.
