
About BoostThreader

Lee, Juyong. 2009. 08. 26. BoostThreader is a sequence-structure threading program published by J. Xu’s group. It is known to be good for hard cases, although the presenter notes that it "does not work…… for me……". Threading requires four ingredients: a sequence, a protein structure, a scoring function, and an algorithm.


Presentation Transcript


  1. About BoostThreader Lee, Juyong 2009. 08. 26

  2. What is BoostThreader? • A Sequence-Structure threading program • Published by J. Xu’s group • Known to be good for hard cases • Does not work…… for me……

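  3. Let’s thread! • What you need: • sequence • protein structure • scoring function • algorithm [Figure: two example threadings with "Match" and "Deletion" positions over residues A to G, one labeled "Bad" and one labeled "Good"]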

  4. Three algorithms for Alignment! • Generative model • Traditional • Hidden Markov Chain • Not that old • Conditional Random Field • Up to date • Dynamic programming [Figure caption: "I’m Andrei Andreyevich Markov. I’m your father"]

  5. Dynamic programming • Finding the best scoring path on the alignment matrix [Figure: alignment matrix with "Initial" and "Final" cells; the path through the matrix is the alignment]

  6. More about Dynamic Programming • Follow the maximum scoring path! [Figure: one cell of the alignment matrix, SEQUENCE on one axis and STRUCTURE on the other, with three moves into F(i+1, j+1): match (score f), deletion (g = gap penalty = -1), insertion (h = gap penalty = -1)]
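The recurrence on slide 6 can be sketched in a few lines of Python. This is a minimal illustration, not BoostThreader's code: the function names and the `identity_score` match function are hypothetical, and both gap moves use the slide's penalty of -1.

```python
def best_alignment_score(seq, struct, match_score, gap=-1):
    """Return the maximum path score through the alignment matrix."""
    n, m = len(seq), len(struct)
    # F[i][j] = best score for aligning seq[:i] with struct[:j]
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap  # leading gaps along one edge
    for j in range(1, m + 1):
        F[0][j] = j * gap  # leading gaps along the other edge
    for i in range(n):
        for j in range(m):
            F[i + 1][j + 1] = max(
                F[i][j] + match_score(seq[i], struct[j]),  # match (f)
                F[i][j + 1] + gap,                          # gap move (g)
                F[i + 1][j] + gap,                          # gap move (h)
            )
    return F[n][m]


def identity_score(a, b):
    # Hypothetical match function: +1 for identical symbols, -1 otherwise.
    return 1 if a == b else -1


score = best_alignment_score("ABDE", "ABCDE", identity_score)
```

Here the best path has four matches and one gap. Recovering the alignment itself would additionally require a traceback from the final cell, which is omitted to keep the sketch short.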

  7. In Conventional seq.-str. alignment • Linear sum of similarities of properties • Only functions for the Match and Gap cases are needed! • Fmatch = w1 * predicted SS * real SS + w2 * predicted SA * real SA + w3 * predicted residue depth * real depth + … • Fgap = Opening penalty + # of gaps * Extension penalty • Only considers the next step!
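The two conventional scoring functions on slide 7 can be written out as follows. This is only a sketch: the property keys, weights, and numeric values are made up for illustration and are not taken from any real threading program.

```python
def match_score(predicted, real, weights):
    """Linear sum of property similarities: sum of w_k * predicted_k * real_k."""
    return sum(w * predicted[k] * real[k] for k, w in weights.items())


def gap_score(n_gaps, opening_penalty, extension_penalty):
    """Fgap = opening penalty + number of gaps * extension penalty."""
    return opening_penalty + n_gaps * extension_penalty


# Hypothetical weights and property values, for illustration only.
weights = {"ss": 1.0, "sa": 0.5, "depth": 0.25}
predicted = {"ss": 1.0, "sa": 0.8, "depth": 0.5}   # from the sequence
real = {"ss": 1.0, "sa": 0.5, "depth": 0.4}        # from the structure

f_match = match_score(predicted, real, weights)
f_gap = gap_score(3, opening_penalty=-2.0, extension_penalty=-0.5)
```

Both functions look only at the current position, which is the limitation the slide points out.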

  8. What’s different in BoostThreader? • Depends on both the current and the next step! • Nine scoring functions are necessary! • Gap penalty is context-dependent • Trained from reference alignments! • DALI, TM-align etc…… • Regression trees are used as scoring functions • Not a linear function!

  9. So what is a Regression Tree?

  10. Intermission • Hey Nature, not all flies are Drosophila

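  11. Regression Tree! • 100 used cars • Training! • Engine over 1500 cc? No / Yes • Older than 5 years? No / Yes • Driven more than 200,000 km? No / Yes • Leaves: average 8 million won, average 5 million won, average 15 million won, average 11 million won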

  12. Example in Threading • Sequence: predicted properties • Structure: observed properties • Is the SS the same? No / Yes • Is the SA level the same? No / Yes (asked on both branches) • Leaves: probability 0.1 (1 out of 10), probability 0.3 (3 out of 10), probability 0.6 (6 out of 10), probability 0.9 (9 out of 10) • Estimate Prob. from examples
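Slide 12's "estimate probabilities from examples" step amounts to counting correct examples in each leaf. The sketch below assumes a fixed two-level tree (SS same?, then SA same?) and a toy data set of 10 examples per leaf; which leaf gets which probability is an assumption, since the flattened slide does not show the branch layout.

```python
from collections import defaultdict


def fit_leaf_probabilities(examples):
    """examples: iterable of (ss_same, sa_same, is_correct) tuples.

    Returns {(ss_same, sa_same): fraction of correct examples in that leaf}.
    """
    counts = defaultdict(lambda: [0, 0])  # leaf -> [n_correct, n_total]
    for ss_same, sa_same, is_correct in examples:
        leaf = (ss_same, sa_same)
        counts[leaf][0] += int(is_correct)
        counts[leaf][1] += 1
    return {leaf: correct / total for leaf, (correct, total) in counts.items()}


def toy_examples():
    # Hypothetical data: 10 examples per leaf with 1, 3, 6 and 9 correct ones.
    layout = {(False, False): 1, (False, True): 3, (True, False): 6, (True, True): 9}
    for (ss_same, sa_same), n_correct in layout.items():
        for k in range(10):
            yield ss_same, sa_same, k < n_correct


leaf_prob = fit_leaf_probabilities(toy_examples())
```

A real regression tree would also learn which questions to ask and in what order; here the questions are fixed so that only the leaf estimation is shown.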

  13. Advantages of Trees • Fast • Interactions between variables are easily taken into account

  14. What’s really happening in BoostThreader? • Initial setting • Set all F0(uv, seq(i), str(j)) = 0 • P ~ exp(F) • 30 correct (reference) sequence-structure alignments! • Calculate the probabilities of all possible state transitions! • Probabilities of all examples! • Forward-backward algorithm
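One way to read "P ~ exp(F)" on slide 14 is that the probability of an alignment is exp(its total score) divided by the sum of exp(score) over all possible alignments. The sketch below computes that normalizer with a forward pass over the alignment matrix. The names `partition_function`, `STEP`, and the virtual start state "S" are illustrative assumptions, as is the choice of which gap state advances which axis. Only the forward half is shown; per-transition probabilities would also need the backward pass.

```python
import math

STATES = ("M", "I", "D")  # match, insertion, deletion
# Assumed moves: (advance in sequence, advance in structure) for each state.
STEP = {"M": (1, 1), "I": (0, 1), "D": (1, 0)}


def partition_function(n_seq, n_str, F):
    """Sum of exp(total score) over all alignments of the two lengths.

    F(u, v, i, j) scores moving from state u to state v and arriving at
    matrix cell (i, j); "S" is a virtual start state.
    """
    fwd = {(0, 0, "S"): 1.0}
    for i in range(n_seq + 1):
        for j in range(n_str + 1):
            for v in STATES:
                di, dj = STEP[v]
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue  # this move would start outside the matrix
                total = 0.0
                for u in ("S",) + STATES:
                    prev = fwd.get((pi, pj, u), 0.0)
                    if prev:
                        total += prev * math.exp(F(u, v, i, j))
                if total:
                    fwd[(i, j, v)] = total
    return sum(fwd.get((n_seq, n_str, v), 0.0) for v in STATES)


def zero_score(u, v, i, j):
    # Initial setting from the slide: every scoring function is 0.
    return 0.0


z_small = partition_function(2, 2, zero_score)
z_slide = partition_function(4, 4, zero_score)
p_reference = 1.0 / z_slide  # exp(0) / Z for any single alignment
```

With all scores set to 0, every alignment is equally likely, so the normalizer simply counts the possible alignments. Training then has to raise the reference alignment's probability above this uniform baseline.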

  15. “All Possible” Transitions? • Reference alignment: AB-DE / abcd- / mmimd • For MM, generate examples! • AB ab, AB bc, AB cd • BD ab, BD bc, BD cd • DE ab, DE bc, DE cd

  16. Examples (2) • Reference alignment: AB-DE / abcd- / mmimd • For MI, generate examples! • B- ab, B- bc, B- cd • A- ab, A- bc, A- cd • D- ab, D- bc, D- cd • E- ab, E- bc, E- cd
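The example lists on slides 15 and 16 follow a simple pattern that can be sketched in Python. The sketch assumes that the uppercase letters are sequence residues and the lowercase letters are structure positions; `candidate_examples` is a hypothetical helper that handles only the two transitions shown on the slides.

```python
from itertools import product


def candidate_examples(seq, struct, transition):
    """Enumerate every (sequence part, structure part) pair for a transition."""
    # Consecutive structure positions, e.g. "ab", "bc", "cd".
    struct_pairs = [struct[j:j + 2] for j in range(len(struct) - 1)]
    if transition == "MM":
        # Two consecutive sequence residues, e.g. "AB", "BD", "DE".
        seq_parts = [seq[i:i + 2] for i in range(len(seq) - 1)]
    elif transition == "MI":
        # One sequence residue followed by a gap, e.g. "A-", "B-".
        seq_parts = [residue + "-" for residue in seq]
    else:
        raise ValueError("only MM and MI are sketched here")
    return list(product(seq_parts, struct_pairs))


mm_examples = candidate_examples("ABDE", "abcd", "MM")
mi_examples = candidate_examples("ABDE", "abcd", "MI")
```

This reproduces the 9 MM examples and the 12 MI examples listed on the slides, although not necessarily in the same order.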

  17. Inside BoostThreader • Examples and their probabilities • Calculated with the current scoring functions • Modify the scoring functions • If the example is correct, increase F! : F1 = F0 + (1 – P) • If the example is incorrect, decrease F! : F1 = F0 – P • Add trees until prediction quality doesn’t increase • F = F0 + F1 + F2 + F3 + F4 + F5 + ……
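The update rule on slide 17 has the shape of gradient boosting: each new tree is fitted to the residual (1 – P) for correct examples and (–P) for incorrect ones, then added to F. The sketch below is a simplified stand-in, not BoostThreader's procedure: it assumes P = sigmoid(F) for each example on its own instead of probabilities from the forward-backward algorithm, uses depth-1 trees, runs a fixed number of rounds, and trains on made-up data with two binary features (SS same, SA same).

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fit_stump(X, residuals):
    """Depth-1 regression tree: split on the binary feature with lowest squared error."""
    best = None
    for f in range(len(X[0])):
        groups = {0: [], 1: []}
        for x, r in zip(X, residuals):
            groups[x[f]].append(r)
        means = {k: (sum(v) / len(v) if v else 0.0) for k, v in groups.items()}
        sse = sum((r - means[x[f]]) ** 2 for x, r in zip(X, residuals))
        if best is None or sse < best[0]:
            best = (sse, f, means)
    _, feature, leaf_means = best
    return lambda x: leaf_means[x[feature]]


def boost(X, y, rounds=20):
    """Return F(x) as a sum of trees, each fitted to the residuals y - P."""
    trees = []

    def F(x):
        return sum(tree(x) for tree in trees)

    for _ in range(rounds):
        # +(1 - P) for correct examples (y = 1), -P for incorrect ones (y = 0).
        residuals = [yi - sigmoid(F(x)) for x, yi in zip(X, y)]
        trees.append(fit_stump(X, residuals))
    return F


# Hypothetical training set: 10 examples per feature pattern.
X, y = [], []
for pattern, n_correct in {(0, 0): 1, (0, 1): 3, (1, 0): 6, (1, 1): 9}.items():
    for k in range(10):
        X.append(pattern)
        y.append(1 if k < n_correct else 0)

F = boost(X, y)
p_high = sigmoid(F((1, 1)))
p_low = sigmoid(F((0, 0)))
```

After boosting, the pattern that is mostly correct in the training data gets a probability above 0.5 and the mostly incorrect one falls below it. The slide's stopping rule (add trees until prediction quality stops improving) would replace the fixed `rounds` argument.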

  18. Performance

  19. Summary • BoostThreader considers both the “current” and the “next” step • The scoring function consists of regression trees • Trees are trained from examples

  20. Thank you!
