
Project: Primer design for cancer genomics


randali


Presentation Transcript


  1. Project: Primer design for cancer genomics Stefano/Hossein

  2. Cancer genomics
     • In cancers, large genetic changes can occur, including deletions, inversions, and rearrangements of the genome.
     • In the early stages, only a few cells will show the deletion.

  3. Polymerase Chain Reaction
     • PCR is a technique for amplifying and detecting a specific portion of the genome.
     • Amplification takes place only when the primers are an appropriate distance apart (<2 kb).

  4. Assaying for Rare Variants
     • PCR can be used to assay for a given genomic abnormality, even in a heterogeneous population of tumor and normal cells.
     [Figure labels: Tumor cell; Extract genomic DNA; PCR; Detection; Distance too large for amplification]

  5. Primer Approximation Multiplex PCR (PAMP)*
     • Multiple primers are optimally spaced, flanking a breakpoint of interest.
       • Upstream of the breakpoint: forward primers
       • Downstream of the breakpoint: reverse primers
     • The primers are run in a multiplex PCR reaction.
     • Any pair can form a viable product.
     [Figure labels: Patient B, Deletion; Patient C, Deletion]
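
A minimal sketch of why a deletion makes detection possible under the "<2 kb" amplification limit from slide 3. The function name, the primer coordinates, and the deletion interval are all hypothetical; the slides do not specify how products are computed, so this only illustrates the distance argument.

```python
MAX_PRODUCT_BP = 2000  # amplification limit quoted on slide 3 (<2 kb)


def amplifying_pairs(forward, reverse, deletion=None, max_product=MAX_PRODUCT_BP):
    """Return (f, r, product_length) for each forward/reverse pair that can amplify.

    forward, reverse: primer positions in reference coordinates (bp).
    deletion: optional half-open interval (start, end) missing from the tumor genome.
    """
    pairs = []
    for f in forward:
        for r in reverse:
            if r <= f:
                continue  # a reverse primer must lie downstream of the forward primer
            length = r - f
            if deletion is not None:
                start, end = deletion
                if f < start and end <= r:
                    # Deletion lies entirely between the primers: product gets shorter.
                    length -= end - start
                elif start <= f < end or start <= r < end:
                    # Deletion removed one of the primer sites: no product.
                    continue
            if length < max_product:
                pairs.append((f, r, length))
    return pairs


forward = [0, 1000, 2000]
reverse = [12000, 13000, 14000]
print(amplifying_pairs(forward, reverse))                          # normal cell
print(amplifying_pairs(forward, reverse, deletion=(2500, 11500)))  # tumor cell
```

In the normal genome every pair is at least 10 kb apart, so nothing amplifies; once the hypothetical deletion brings a forward and a reverse primer within 2 kb of each other, that pair yields a product.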

  6. Goal
     • Input: a collection of primer locations and matrices of primer interactions (Forward/Forward, Forward/Reverse, Reverse/Reverse).
     • Identify a subset of primers that do not interact and are unique, maximizing the covered region.

  7. Algorithms for Optimizing the Cost
     • Preprocessing
       • Determining the pairs of primers that dimerize (edges in the graph)
       • Filtering the primers to ensure "uniqueness"
     • Simulated annealing
       1. Start from an initial candidate set P, generated randomly or greedily.
       2. List the neighboring sets P' and compute [formula not preserved in the transcript].
       3. Select step s with a probability proportional to [formula not preserved in the transcript].
       4. Decrease the temperature T and go to step 2.
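
The four annealing steps can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the selection weight exp(-ΔC/T) is the usual Boltzmann choice and is assumed here because the slide's own formulas did not survive, and the `neighbors` and `cost` callables, the geometric cooling schedule, and all parameter defaults are hypothetical.

```python
import math
import random


def simulated_annealing(initial, neighbors, cost, t_start=1.0, cooling=0.95,
                        steps=200, seed=0):
    """Annealing loop over candidate primer sets.

    initial:   starting candidate set P (a frozenset of primer ids).
    neighbors: function P -> list of neighboring sets P'.
    cost:      function P -> float, lower is better.
    """
    rng = random.Random(seed)
    current, best = initial, initial
    temperature = t_start
    for _ in range(steps):
        candidates = neighbors(current)          # step 2: list the neighbors
        if not candidates:
            break
        base = cost(current)
        deltas = [cost(c) - base for c in candidates]
        # Shift by the smallest delta so the largest weight is exp(0) = 1;
        # this leaves the sampling probabilities unchanged and avoids overflow.
        shift = min(deltas)
        weights = [math.exp(-(d - shift) / temperature) for d in deltas]
        # Step 3: pick a neighbor with probability proportional to its weight
        # (assumed Boltzmann weight, see the note above).
        current = rng.choices(candidates, weights=weights, k=1)[0]
        if cost(current) < cost(best):
            best = current
        temperature *= cooling                   # step 4: cool down
    return best
```

The loop returns the best set seen rather than the final one, since at nonzero temperature the walk can move away from a good solution.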

  8. Cost Function
     • The cost function used takes coverage and dimerization into account.
     [Equation not preserved in the transcript; its terms are labeled Coverage, Density, and Dimerization]

  9. Simulated Annealing: Define Neighbors
     • Approach 1:
       • Set [definition not preserved in the transcript]
       • E is the edge set corresponding to dimerizing pairs.
       • Neighbors of P are formed by adding a vertex u to P and removing all vertices dimerizing with u; i.e. [formula not preserved in the transcript]
     • Approach 2:
       • No hard constraint on dimerizing pairs.
       • Neighbors of P are obtained by adding or removing one vertex from P.
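
Both neighbor definitions follow directly from their verbal descriptions and can be sketched as below. The function names and the representation (primer ids as vertices, dimerizing pairs as a list of edges) are assumptions made for illustration.

```python
def neighbors_independent(P, vertices, edges):
    """Approach 1: add a vertex u to P, then drop every vertex that dimerizes with u."""
    adjacency = {v: set() for v in vertices}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    result = []
    for u in vertices:
        if u not in P:
            result.append(frozenset((P | {u}) - adjacency[u]))
    return result


def neighbors_flip(P, vertices):
    """Approach 2: add or remove exactly one vertex (no hard dimerization constraint)."""
    return [frozenset(P ^ {u}) for u in vertices]


P = frozenset({1})
print(neighbors_independent(P, [1, 2, 3], [(1, 2)]))
print(neighbors_flip(P, [1, 2, 3]))
```

With Approach 1, a set that starts free of dimerizing pairs stays free of them after every move, so dimerization never has to appear in the cost. Approach 2 allows dimerizing pairs in P and therefore relies on the cost function to penalize them.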

  10. ILP Formulation
     • [Symbol not preserved in the transcript]: indicator of primer i being selected.
     • q_ij (as named on slide 15): indicator of candidate primer i being immediately after primer j.
     • Guaranteed optimality, but intractable for realistic problems.
     • Used here to assess the performance of simulated annealing.

  11. Bounds and Numerical Results
     • A weak theoretical upper bound:
       • Select all primers without dimerization constraints.
       • For any two adjacent primers with distance [condition not preserved in the transcript], reduce the covered region by [amount not preserved in the transcript] bp.
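
One plausible reading of this bound can be sketched as follows. The slide's threshold and reduction amount are lost, so the code assumes that each primer covers at most d bp toward its neighbor and that a gap larger than d leaves (gap - d) bp uncovered; that assumption, the function name, and the example numbers are all hypothetical.

```python
def coverage_upper_bound(positions, d):
    """Upper bound on covered bp when every candidate primer is selected.

    positions: primer locations (bp) on one side of the breakpoint.
    d: assumed maximum distance a single primer can cover.
    """
    ordered = sorted(positions)
    if len(ordered) < 2:
        return 0
    span = ordered[-1] - ordered[0]
    # Assumed rule: any gap wider than d contributes (gap - d) uncovered bp.
    uncovered = sum(max(0, b - a - d) for a, b in zip(ordered, ordered[1:]))
    return span - uncovered


print(coverage_upper_bound([0, 500, 2500, 3000], d=1000))  # span 3000, one 2000 bp gap
```

Because no primer is removed for dimerization, any feasible selection covers no more than this value under the same coverage rule, which is what makes it an upper bound, and also why it is weak.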

  12. Potential Improvements
     • Improving the cost function formulation
       • Incorporating multiplexing sets
     • Finding an efficient technique to solve the optimization problem
     • Improving the analytical bound
       • Considering the effect of dimerization within the forward/reverse primer set

  13. Pairwise cost function
     • Measures the total number of sites that remain uncovered, taken over all forward and reverse primer combinations.

  14. Multiobjective cost function
     • Takes coverage and multiplexing sets into account.
     • Minimizes both objectives while resolving the dimerization constraint, given a possible solution containing multiplexing sets S.
     [Equation not preserved in the transcript; its terms are labeled Sets and Missed coverage]

  15. Using Fewer Integer Variables
     • The formulation in the paper uses n^2 auxiliary variables, one for each pair of primers: q_ij = 1 if and only if primers i and j are selected as two consecutive primers in the candidate set.
     • The complexity of ILP (or IQP) generally grows exponentially with the number of integer variables.
     • In practice, the distance between two consecutive primers in the solution is not much larger than d; otherwise there would be a large gap in the covered region.
     • Assume an upper bound g on the distance between consecutive primers.
     • Introduce a variable q_ij only if l_i - l_j < g.
     • The average number of variables is reduced to n(1 + ρg), where ρ is the density of the primers in the initial set.
     • The number of integer variables becomes O(n).
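
The saving can be checked with a short count. This sketch assumes distinct primer locations and counts only pairs with 0 < l_i - l_j < g; the function name and the evenly spaced example are hypothetical.

```python
from bisect import bisect_right


def count_pair_variables(locations, g):
    """Count pairs (i, j) with 0 < l_i - l_j < g, i.e. the q_ij that are kept."""
    ordered = sorted(locations)
    count = 0
    for idx, li in enumerate(ordered):
        # First index whose location is strictly greater than l_i - g.
        first = bisect_right(ordered, li - g)
        count += idx - first
    return count


locations = list(range(0, 1000, 10))   # n = 100 primers, density rho = 0.1 per bp
n = len(locations)
kept = count_pair_variables(locations, g=50)
print(n * n, n + kept)                 # full formulation vs. windowed formulation
```

Here the windowed count grows with n times the number of primers inside one window of length g, in line with the n(1 + ρg) estimate, whereas the full formulation grows with n^2.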
