Molecular Genetics Laboratory, AJOU UNIVERSITY

Opti-Sequence Finder: A bioinformatics tool for an antisense technique in wet experiments

Ho-Sang Jeon², Seong-Jo Kim², Seung-Pyo Hong¹, Hyon-Chang Kim¹, Churl K. Min³, Han Jip Kim¹,*
¹Department of Biological Sciences, ²Division of Information & Computer Engineering, ³Department of Molecular Science & Technology, Ajou University, Suwon 443-749, Korea
*Corresponding author: Han Jip Kim (hjkim@ajou.ac.kr)

Abstract

Antisense oligomers effectively block the transcription or expression of a functional gene on the basis of sequence homology. However, there is no reliable way to ensure that a designed antisense oligomer acts only on the targeted gene without interacting with other genes. Here we have developed a tool that finds optimized sequences to minimize the off-target effects of antisense oligomers. We first used the Smith-Waterman algorithm, but it required too much memory to run on a personal computer. We therefore modified the algorithm along the lines of the Basic Local Alignment Search Tool (BLAST) [1]. If the length of the genome sequence is n and the length of the target gene is m, the Smith-Waterman algorithm requires O(nm) memory, whereas our algorithm requires only O(m²) because it narrows the comparison from the whole genome sequence to the expected locations. We then applied a statistical approach to examine the interactions between antisense oligomers and target genes.

Introduction

The antisense technique is a powerful method for regulating gene expression in modern molecular biology research. It uses antisense oligomers that interfere with transcription or gene expression by binding to the complementary sequence of the target gene or its mRNA. Antisense oligomers such as methylphosphonate (MP) oligomers and peptide nucleic acid (PNA) oligomers are highly stable against nucleases and bind strongly to the target gene or mRNA. However, antisense oligomers raise several concerns: oligomer quality, stability against nucleases, accessibility of the target sequence, and nonspecificity caused by off-target effects, that is, unintended interactions with other genes. In particular, an antisense oligomer with low specificity readily interacts with unintended partners and causes side effects. [2] Here we present a bioinformatics tool that helps biologists design more accurate and effective antisense oligomers by minimizing the chance of off-target effects.

Statistical approach

• The equilibrium constant K_0 among P·T (PNA bound to the target), P (free PNA), and T (free, unblocked target sequences) is described by equation (1).
• I = the number of class 1 mismatches (A-T)
• J = the number of class 2 mismatches (G-C)
• f1 = the frustration factor of class 1
• f2 = the frustration factor of class 2
• D = DNA (off-target sequence)
• P(D_IJ) = the probability that a consecutive sequence of n bases on the DNA has exactly I mismatches in class 1 and J mismatches in class 2

P denotes peptide nucleic acid (PNA), a popular type of antisense oligomer. First, the equilibrium (binding) constant K_IJ for each combination of mismatch counts is calculated, taking into account the length of the antisense oligomer and the strength of the chemical interactions (Eqs. 1, 2). The tool then calculates the probability that the antisense oligomer binds to the target sequence or to an off-target sequence by screening the proportion of each kind of sequence in the whole genomic sequence. Because [P·D_IJ] cannot be obtained directly, equation 3 is rewritten using equations 4 and 5. [4]
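As a rough illustration of the quantity P(D_IJ) defined in the statistical approach, the sketch below computes it for a random DNA site. It is not the poster's equations 1 to 5, which are not reproduced in this text. It assumes that every base of the off-target site is independent and uniformly distributed, so each position mismatches with probability 3/4, and it reads "class 1" as mismatches at positions where the oligomer expects an A-T pair and "class 2" as mismatches at G-C positions. The function names and the `p_mismatch` parameter are hypothetical.

```python
from math import comb


def binomial(n: int, k: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    if k < 0 or k > n:
        return 0.0
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def mismatch_probability(n_at: int, n_gc: int, i: int, j: int,
                         p_mismatch: float = 0.75) -> float:
    """P(D_IJ) under the assumptions stated above (not the poster's equations).

    n_at, n_gc: numbers of A-T and G-C positions in the oligomer (n = n_at + n_gc)
    i, j:       numbers of class 1 and class 2 mismatches
    p_mismatch: chance that a random base mismatches (3/4 for uniform bases)
    """
    return binomial(n_at, i, p_mismatch) * binomial(n_gc, j, p_mismatch)


# Example: a 10-base oligomer with 6 A-T and 4 G-C positions (made-up split).
n_at, n_gc = 6, 4
p_perfect = mismatch_probability(n_at, n_gc, 0, 0)   # perfect match: (1/4)**10
p_total = sum(mismatch_probability(n_at, n_gc, i, j)
              for i in range(n_at + 1) for j in range(n_gc + 1))
```

Under these assumptions the probabilities over all (I, J) pairs sum to 1, and multiplying P(D_IJ) by the number of candidate sites in a genome gives the expected number of sites with that mismatch pattern.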
Basic principles

First-generation antisense oligos are composed of natural genetic material and often contain crosslinking agents for irreversible binding to their targets. A number of non-natural antisense structural types were developed to improve stability and delivery efficiency (Fig. 1A). There are also several novel antisense types that no longer resemble nucleic acids; these oligos contain acyclic backbone moieties, including nylon (Fig. 1B). [3]

Figure 1. Representative antisense structural types

Conclusion

Table 1. Evaluation of the off-target effects of the PNA oligomer targeting the acpP gene of Escherichia coli K12

Using our program, we reviewed published antisense data for possible off-target effects. We examined an article reporting bacterial growth inhibition by a PNA oligomer targeting the messenger RNA (mRNA) that encodes the essential fatty acid biosynthesis protein Acp. [5] The antisense sequence, 10 bases long, included the start codon of the acpP gene to block translation initiation effectively. Using our tool, we measured the off-target effects of this antisense sequence. As shown in Table 1, 8 sequences were selected as candidates, including the sequence used in the article. All of them include the start codon of the acpP gene. We found several sequences with lower predicted off-target effects, which might have given better results in the experiment. The tool should provide useful information to biologists by predicting the sequences with the fewest off-target effects for wet experiments.

*Supported by grant No. RTI04-03-05 from the Regional Technology Innovation Program of the Korean Ministry of Commerce, Industry and Energy (MOCIE).

References

[1] Stephen F. Altschul, Warren Gish, Webb Miller, Eugene W. Myers, and David J. Lipman, "Basic Local Alignment Search Tool", Journal of Molecular Biology, Volume 215, Issue 3, 5 October 1990, Pages 403-410.
[2] http://plaza.snu.ac.kr/~heo1013/98-1.htm
[3] James Summerton, Dwight Weller, "Morpholino Antisense Oligomers: Design, Preparation, and Properties", Antisense & Nucleic Acid Drug Development 7:187-195 (1997), Mary Ann Liebert, Inc.
[4] Tommi Ratilainen, "A Simple Model for Gene Targeting", Biophysical Journal, 81, 2876-2885, 2001.
[5] Liam Good, Satish Kumar Awasthi, Rikard Dryselius, Ola Larsson, and Peter E. Nielsen, "Bactericidal antisense effects of peptide-PNA conjugates", Nature Biotechnology, Volume 19, April 2001.

Algorithm

Figure 2. Target exclusive genome sequence
Figure 3. Sequence partitioning

The program first separates the target gene sequence from the off-target gene sequence (Fig. 2). Extracting the target gene sequence from the whole genome sequence with a general dynamic programming algorithm requires gigabytes of memory, which is impractical on a personal computer. To solve this problem, we divide the whole genome sequence into partitions the length of the target gene sequence and select the partitions that match well with the 10-nucleotide fragments of the target gene sequence (Fig. 3). This reduces the time and storage complexity without compromising accuracy.
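The partition-and-screen step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the program's actual code: the function names, the exact-substring matching of fragments, and the `min_hits` threshold are all hypothetical choices. The poster only states that partitions "highly matched" with the 10-nucleotide fragments are chosen.

```python
from typing import List, Tuple


def fragments(seq: str, k: int = 10) -> List[str]:
    """Split seq into consecutive, non-overlapping k-mers (a short tail is dropped)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]


def screen_partitions(genome: str, target: str, k: int = 10,
                      min_hits: int = 2) -> List[Tuple[int, int]]:
    """Return (start, hits) for genome partitions that contain at least
    min_hits of the target's k-mer fragments, best-scoring first.

    Partitions have the length of the target and do not overlap.
    Exact substring matching is an assumption made for this sketch.
    """
    m = len(target)
    seeds = set(fragments(target, k))
    candidates = []
    for start in range(0, len(genome), m):
        window = genome[start:start + m]
        hits = sum(1 for seed in seeds if seed in window)
        if hits >= min_hits:
            candidates.append((start, hits))
    return sorted(candidates, key=lambda c: -c[1])


# Toy example: a 30-base "target gene" embedded in a 90-base "genome".
target = "ACGTTGCAAG" "GCTTAGCCAT" "GGATCCTTGA"
genome = "A" * 30 + target + "C" * 30
print(screen_partitions(genome, target))  # [(30, 3)]
```

Only the selected partitions would then be passed to the dynamic programming alignment, so the alignment table is m by m rather than n by m, which is where the O(m²) memory bound comes from. One caveat of this sketch: with fixed, non-overlapping partitions a match that straddles a partition boundary is split in two, so an implementation might use overlapping partitions instead.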