CSE891-002 Selected Topics in Bioinformatics


Presentation Transcript


  1. CSE891-002 Selected Topics in Bioinformatics Jin Chen 232 Plant Biology Bld. 2011 Spring

  2. About me… • Jin Chen, Assistant Professor in CSE and PRL from 2009 • Office: 232 Plant Biology Lab. Tel: (517) 355-5015. Email: jinchen@msu.edu

  3. Outline • Course Description • Introduction to Computational Network Biology

  4. Course Description • Course objectives: study interesting computational network biology problems and their algorithms, with a focus on the principles used to design those algorithms. (3 credits) • Instructor: Jin Chen, Office: 232 Plant Biology Bld. Email: jinchen@msu.edu • Office hours: Thursday 2PM-3PM. If you cannot attend office hours, email me to schedule a different time. • Web page: http://www.msu.edu/~jinchen/cse891a

  5. Course Description • Course work: one 80-minute lecture, and 80 minutes of discussion & student presentations each week • Grading policy: the course will be graded on attendance (10%), participation (20%), and presentation (70%). • No final exam

  6. Course Description • Prerequisites: Graduate students in science or engineering. Note: an override is necessary for non-CSE graduate students; please send your PID & NetID to me. • No prior knowledge of biology is required. Computationally inclined biology graduate students are encouraged to take the class as well.

  7. Suggested books • A.-L. Barabási, Linked: The new science of networks • U. Alon, An Introduction to Systems Biology • B. Palsson. Systems Biology: Properties of Reconstructed Networks • K. Kaneko, Life: An Introduction to Complex Systems Biology

  8. Course Description Network Biology Graph Mining

  9. Paper list • Chua et al. Exploiting indirect neighbours and topological weight to predict protein function from protein–protein interactions. Bioinformatics 22(13): 1623-1630. 2006 • Kashani et al. Kavosh: a new algorithm for finding network motifs. BMC Bioinformatics 10:318. 2009 • Deng et al. Prediction of Protein Function Using Protein–Protein Interaction Data. Journal of Computational Biology 10(6): 947-960. 2003 • Hu et al. Mining coherent dense subgraphs across massive biological networks for functional discovery. Bioinformatics 21 Suppl. 1: i213–i221. 2005 • Xu et al. Mining Shifting-and-Scaling Co-Regulation Patterns on Gene Expression Profiles. ICDE 2006 • Xu et al. Discovering cis-Regulatory RNAs in Shewanella Genomes by Support Vector Machines. PLoS Computational Biology 5(4). 2009 • Huang et al. Large-scale regulatory network analysis from microarray data: modified Bayesian network learning and association rule mining. Decision Support Systems 43: 1207–1225. 2007 • Honkela et al. Model-based method for transcription factor target identification with limited data. PNAS 107(17): 7793–7798. 2010 • Vermeirssen et al. Transcription factor modularity in a Gene-Centered C. elegans Protein-DNA interaction network. Genome Research 17: 1061-1071. 2007 • Covert et al. Transcriptional Regulation in Constraints-Based Metabolic Models of Escherichia coli. Journal of Biological Chemistry 277(31): 28058-28064. 2002 • Herrgard et al. Integrated analysis of regulatory and metabolic networks reveals novel regulatory mechanisms in Saccharomyces cerevisiae. Genome Research 16: 627–635. 2006 • Barabási et al. Network Biology: Understanding the Cell's Functional Organization. Nature Reviews Genetics 5: 101-113. 2004 • Dongen. A cluster algorithm for graphs. Technical Report INS-R0010, National Research Institute for Mathematics and Computer Science in the Netherlands, Amsterdam, May 2000 • Huan et al. Mining Family Specific Residue Packing Patterns from Protein Structure Graphs. RECOMB, pp. 308-315. 2004

  10. Course Description • Select at least one paper for presentation from the paper list. Email me which paper you will present by next Mon (1/17/2011) • Each presentation is 45 min, including 15 min Q&A, followed by a discussion • Your grade will be largely determined by the presentation (70%) • Presentations start next Tue (1/18/2011)

  11. Important Days • Class begins: 1/10/2011 • Open adds end: 1/14/2011 • Last day to drop with refund: 2/3/2011 • Last day to drop with no grade reported: 3/2/2011 • Class ends: 5/6/2011

  12. Introduction to Computational Network Biology • Network biology belongs to systems biology, which belongs to genomics • Interested in the relations between entities rather than the entities themselves http://bionet.bioapps.biozentrum.uni-wuerzburg.de/

  13. Networks are everywhere • Internet, social network, anti-terrorism network • Biological networks • Protein-protein interaction (PPI) network • Protein-DNA interaction network • Gene correlation network • Gene regulatory network • Metabolic network • Signaling network… • A network is a tool for understanding complex systems • Network models explain network properties and support the study of network behavior • Network measures provide quantitative analysis of complex systems

  14. Definition of network (graph) • A graph G(V,E) consists of a set of nodes (vertices) V and a set of edges E • Figure labels: node (vertex), edge, self-loop, multi-set of edges • Simple graph: has no loops (self-edges) and no multi-edges.
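The simple-graph definition can be checked mechanically. The following is a minimal sketch, assuming an undirected graph stored as a list of node pairs; the function names `classify_edges` and `is_simple` are illustrative and do not come from the slides.

```python
from collections import Counter


def classify_edges(edges):
    """Return (self_loops, multi_edges) for an undirected edge list."""
    # A self-loop joins a node to itself.
    self_loops = [(u, v) for u, v in edges if u == v]
    # (u, v) and (v, u) are the same undirected edge, so count them together.
    counts = Counter(frozenset((u, v)) for u, v in edges if u != v)
    multi_edges = sorted(tuple(sorted(e)) for e, c in counts.items() if c > 1)
    return self_loops, multi_edges


def is_simple(edges):
    """A simple graph has neither self-loops nor multi-edges."""
    self_loops, multi_edges = classify_edges(edges)
    return not self_loops and not multi_edges


example = [("a", "b"), ("b", "c"), ("c", "b"), ("d", "d")]
print(classify_edges(example))
print(is_simple(example))
```

Here the pair b-c appears twice and d has a self-loop, so the example graph is not simple.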

  15. Definition of network (graph) Directed graph vs. Undirected graph Labeled graph vs. Unlabeled graph Symmetric graph vs. Asymmetric graph

  16. Webpage layout Pages on a web site and the hyperlinks between them M. Newman and M. Girvan. Finding and evaluating community structure in networks. Phys. Rev. E 69, 026113 2004

  17. Adapted from R Albert’s slides

  18. Biological networks

  19. Yeast Protein-Protein Interaction network Hawoong Jeong

  20. Gene regulation network of sea urchin Eric Davidson

  21. Metabolic flux analysis of E. coli Abhishek Murarka

  22. Why study networks? • Complex systems cannot be described in a reductionist view • Behavior study of complex systems starts with understanding the network topology • Network-related questions: • How do we reconstruct a network? • How can we quantitatively describe large networks? • How did networks get to be the way they are?

  23. Simple measures • Node degree: the number of edges connected to the node • In-degree & out-degree (directed graphs) • Total in-degree == total out-degree • Average degree: the average of the node degrees over all nodes in the network, denoted as $\langle k \rangle = \frac{1}{N} \sum_{i=1}^{N} k_i$, where N is the number of nodes in the network and $k_i$ is the degree of node i • Degree distribution: the degree distribution P(k) gives the fraction of nodes that have k edges
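A minimal sketch of these degree measures, assuming the network is given as a plain edge list (pairs of node labels). The helper names are illustrative, and nodes with no edges are not counted because they never appear in an edge list.

```python
from collections import Counter


def degrees(edges):
    """Degree of each node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def average_degree(deg):
    """<k> = (1/N) * sum of k_i over all N nodes."""
    return sum(deg.values()) / len(deg)


def degree_distribution(deg):
    """P(k): fraction of nodes that have exactly k edges."""
    n = len(deg)
    counts = Counter(deg.values())
    return {k: c / n for k, c in sorted(counts.items())}


def in_out_degrees(arcs):
    """In-degree and out-degree of each node in a directed arc list (u -> v)."""
    indeg, outdeg = Counter(), Counter()
    for u, v in arcs:
        outdeg[u] += 1
        indeg[v] += 1
    return dict(indeg), dict(outdeg)


deg = degrees([(1, 2), (1, 3), (1, 4), (2, 3)])
print(deg, average_degree(deg), degree_distribution(deg))
```

Every arc contributes exactly one unit of out-degree and one unit of in-degree, which is why the two totals always agree.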

  24. Simple measures • Shortest path: a path between two nodes such that the sum of the weights of its constituent edges is minimized • Graph diameter: the longest shortest path between any pair of nodes in the graph • Connected graph: any two vertices can be joined by a path • Bridge: an edge whose removal disconnects the graph
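These measures can be sketched with breadth-first search. The code below assumes an unweighted, undirected, simple graph (every edge has weight 1), and the bridge test uses a deliberately naive remove-and-recheck approach rather than a linear-time algorithm. All function names are illustrative.

```python
from collections import deque


def adjacency(nodes, edges):
    """Build an undirected adjacency map from a node list and an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_connected(adj):
    """True if every node is reachable from an arbitrary start node."""
    if not adj:
        return True
    start = next(iter(adj))
    return len(bfs_distances(adj, start)) == len(adj)


def diameter(adj):
    """Longest shortest path; assumes the graph is connected."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


def bridges(nodes, edges):
    """Edges whose removal disconnects the graph (naive check, one BFS per edge)."""
    found = []
    for e in edges:
        rest = [f for f in edges if f != e]
        if not is_connected(adjacency(nodes, rest)):
            found.append(e)
    return found


nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 1), (3, 4), (4, 5)]  # a triangle with a two-edge tail
adj = adjacency(nodes, edges)
print(is_connected(adj), diameter(adj), bridges(nodes, edges))
```

For weighted graphs, Dijkstra's algorithm would replace the BFS; the definitions of diameter and bridge stay the same.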

  25. Simple measures • Betweenness centrality: for all node pairs (i, j), find all the shortest paths between nodes i and j, denoted as C(i,j), and determine how many of these pass through node k, denoted as Ck(i,j). Betweenness centrality of node k is $B_k = \sum_{i \neq j,\; i,j \neq k} \frac{C_k(i,j)}{C(i,j)}$ • Calculating the betweenness involves calculating the shortest paths between all pairs of vertices on a graph: $O(V^2 \log V + VE)$ for a sparse graph with Johnson’s algorithm. L. C. Freeman, Sociometry 40, 35 (1977); P. E. Black, Dictionary of Algorithms and Data Structures (2004)
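The definition can be turned into a direct, brute-force sketch for unweighted graphs. This is not Johnson's algorithm: it runs one BFS per node to count shortest paths and then sums Ck(i,j)/C(i,j) over ordered pairs (i, j), so in an undirected graph each unordered pair contributes twice (halve the scores if each pair should count once). The function names are illustrative.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): shortest distance and number of shortest paths from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            # u is a predecessor of v on a shortest path from source.
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Sum over ordered pairs (i, j) of the fraction of shortest paths through k."""
    info = {s: bfs_counts(adj, s) for s in adj}
    score = {k: 0.0 for k in adj}
    for i in adj:
        dist_i, sigma_i = info[i]
        for j in adj:
            if j == i or j not in dist_i:
                continue
            for k in adj:
                if k == i or k == j or k not in dist_i:
                    continue
                dist_k, sigma_k = info[k]
                # k lies on a shortest i-j path iff d(i,k) + d(k,j) == d(i,j).
                if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                    score[k] += sigma_i[k] * sigma_k[j] / sigma_i[j]
    return score


path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
print(betweenness(path))
```

In the three-node path, only the middle node lies between another pair, so it is the only node with a nonzero score.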

  26. Complex measures • Frequent subgraph mining • Graph comparison & classification • Graph isomorphism testing

  27. Useful software • Visualization & Topological Analysis • Cytoscape (www.cytoscape.org) • Pajek (vlado.fmf.uni-lj.si/pub/networks/pajek) • Graph related programming • LEDA (www.algorithmic-solutions.com) • Nauty (www.cs.sunysb.edu/~algorith/implement/nauty/implement.shtml)

  28. 1960 1999 2002

  29. Real networks are much more complex • Transcription regulatory networks of yeast and E. coli show an interesting example of mixed characteristics • How many genes a TF interacts with: scale-free • How many TFs interact with a given gene: exponential

  30. Modularity and network motif • Cellular functions are likely to be carried out in a highly modular manner • Module: a group of genes/proteins that work together to achieve distinct functions • Biology is full of examples of modularity

  31. Remaining challenges • Discovery of network motifs is closely related to the generation of random networks • The structure of a network motif does not necessarily determine its function • The relation between higher-level organizational and functional states and networks has not yet been studied Voigt, W. et al. Genetics 2005; Ingram, P.J. et al. BMC Genomics 2006; Eric Werner. Nature 2007
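To illustrate why motif discovery depends on random network generation, here is a minimal sketch, not taken from the slides. It counts one common motif, the feed-forward loop (a->b, b->c, a->c), and builds a randomized comparison network by degree-preserving arc swaps. The motif choice, the swap scheme, and the function names `count_ffl` and `rewire` are all assumptions made for illustration; the input is assumed to contain no duplicate arcs.

```python
import random


def count_ffl(arcs):
    """Count feed-forward loops: a->b, b->c and a->c with a, b, c distinct."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, set()).add(v)
    total = 0
    for a, targets in succ.items():
        for b in targets:
            if b == a:
                continue
            for c in succ.get(b, ()):
                if c != a and c != b and c in targets:
                    total += 1
    return total


def rewire(arcs, n_swaps, seed=0):
    """Randomize a directed network while keeping every in- and out-degree."""
    rng = random.Random(seed)
    arcs = list(arcs)
    present = set(arcs)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(arcs)), 2)
        (a, b), (c, d) = arcs[i], arcs[j]
        # Proposed swap of targets: a->b, c->d becomes a->d, c->b.
        # Skip swaps that would create a self-loop or a duplicate arc.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        arcs[i], arcs[j] = (a, d), (c, b)
    return arcs


network = [("x", "y"), ("y", "z"), ("x", "z"), ("z", "w"), ("w", "v"), ("v", "x")]
observed = count_ffl(network)
random_counts = [count_ffl(rewire(network, 50, seed=s)) for s in range(20)]
print(observed, random_counts)
```

A subgraph is usually called a motif only when its observed count is clearly higher than the counts in such randomized networks, so the choice of random model directly affects which motifs are reported.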

  32. Next class • PPI network construction • False-positive detection
