Homology & Sequence Similarity

Use these slides to go with question #13 for the pre-institute reading “What is a Gene”.


Presentation Transcript


  1. Homology & Sequence Similarity: Use these slides to go with question #13 for the pre-institute reading “What is a Gene”.

  2. Homology & Sequence Similarity: I found genes from 4 genomes (Agrobacterium tumefaciens C58, Mycobacterium tuberculosis H37Rv, E. coli K12, and Methanococcus maripaludis S2) that all have the same name. Does that mean that these genes are homologs? What is a better indicator of homology at the sequence level?

  3. Homology & Sequence Similarity: Here is some data on encoded protein sequence similarity using BLAST as an alignment tool (we will talk a lot about how BLAST works during the institute).

     Each cell shows % QUERY COVERAGE / % IDENTITY / % POSITIVES:

     | Query Sequence | Mycobacterium | E. coli  | Methanococcus |
     |----------------|---------------|----------|---------------|
     | Agrobacterium  | 97/24/42      | 82/22/39 | 15/24/51      |
     | Mycobacterium  |               | 93/22/38 | 53/26/41      |
     | E. coli        |               |          | 25/29/42      |

     - % QUERY COVERAGE = what % of the query protein sequence aligned with the comparison protein sequence
     - % IDENTITY = of those query amino acid residues in the alignment, what % were identical to the aligned residues in the comparison protein
     - % POSITIVES = of those query amino acid residues in the alignment, what % were identical to or evolutionarily common replacements for the aligned residues in the comparison protein

     There is not a lot of sequence identity and not even a great amount of similarity (positives), so are these homologs or not? Maybe seeing the sequences aligned next to each other will help.
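
The three statistics defined on this slide can be made concrete with a small calculation. The sketch below is not how BLAST itself is implemented; it only applies the slide's definitions to one already-aligned pair of sequences. The function name `alignment_stats`, the toy sequences, and the short `CONSERVATIVE_PAIRS` list are all assumptions made for illustration. BLAST decides what counts as a "positive" from a full amino acid scoring matrix, not from a hand-picked list like this one.

```python
# Hypothetical, partial list of residue pairs treated as "evolutionarily
# common replacements". BLAST uses a full scoring matrix instead; this short
# list exists only to keep the example self-contained.
CONSERVATIVE_PAIRS = {
    frozenset(pair)
    for pair in ("IL", "IV", "LV", "LM", "KR", "DE", "ST", "FY", "QE", "ND")
}


def alignment_stats(query_aln, subject_aln, query_length):
    """Return (% query coverage, % identity, % positives) for one alignment.

    query_aln and subject_aln are equal-length aligned strings that use '-'
    for gaps. query_length is the length of the full, unaligned query protein.
    Denominators follow the slide's wording: identity and positives are taken
    over the query residues that appear in the alignment.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")

    columns = list(zip(query_aln, subject_aln))
    aligned = sum(1 for q, _ in columns if q != "-")
    identical = sum(1 for q, s in columns if q != "-" and q == s)
    positive = sum(
        1
        for q, s in columns
        if q != "-" and s != "-"
        and (q == s or frozenset((q, s)) in CONSERVATIVE_PAIRS)
    )

    if aligned == 0 or query_length == 0:
        return (0.0, 0.0, 0.0)

    coverage = 100.0 * aligned / query_length
    identity = 100.0 * identical / aligned
    positives = 100.0 * positive / aligned
    return (coverage, identity, positives)


# Toy example (made-up sequences): 5 of the query's 10 residues are aligned.
print(alignment_stats("MKT-LV", "MRTAIV", query_length=10))
```

Because identical residues also count as positives, % positives can never be lower than % identity, which matches every cell in the table.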

  4. >Agrobacterium tumefaciens C58 ------MTDIMK------------------------------P-DLRPGNTHFSSGPCSKRPGWSLDAL----------SDAPLGRSHRAKVGKAKLKQAIDLTREILNVPA-DYRIGIVPASDTGAVEMALWSLLGE-RGVDMLAWESFGAGWVTDVVKQLKLKDVRKFEA-----DY-GLLPNLAE-------VDFDRDVVFTWNGTTSGVRVANADFI--PADRKGLTICDATSAAFAQDM--DFTKLDVVTFSWQKVLGGEGGHGVIILSPRAVERLLSYSP-AWPLPKIFRMVSGGK-----LIEGI-FTGETINTPSMLCVEDYIDALLWAKNLGGLKALIGRADANAKVIYDFIEKNNW-IANLAVKPETRSNTSVCLKIVDPEVQALDAAAQADFAKGIVALLEKENVALDIGAYRDA-PSGLRIWAGATIETADMEAVMPWLAWAYQTQ-------------K-AALSKAAA >Bradyrhizobium japonicum USDA110 ------MTVA-K------------------------------P-ASRPNVPHFSSGPCAKRPGWNAQNL----------KDAALGRSHRAKVGKTKLKLAIDLTREVLEVPA-DYRIGIVPASDTGAVEMALWSLLGA-RPVTTLAWESFGEGWVSDIVKELKLKDVTKLNA-----AY-GEIPDLSK-------VDPKSDVVFTWNGTTSGVRVPNADWI--SATREGLTICDATSAAFAQAL--DWAKLDVVTFSWQKALGGEAAHGMLILSPRAVERLETYKP-AWPLPKIFRMTKGGK-----INEGI-FVGETINTPSMLCVEDYLDALNWAKSIGGLKALIARADANTKVLADWKAKTPW-IDFLAKDASIRSNTSVCLKFIDPALTALSDDAQAEFSKKLVALVEKEGAGYDFAYYRDA-PAGLRIWCGATVEARDVELLTQWIDWAFAET-------------K-AQ-LAKAA >Caulobacter sp. 
K31 ------MTTLAK------------------------------P-AQRPARPEFSSGPCAKRPGWTPENL----------RNAVLGRSHRSKLGKARLKAAIDQTRDVLEVPA-DFLIGIVPGSDTGAVEMAMWSMLGQ-RPVQLLAFESFGKDWVTDVTKQLKLPNVEVLDA-----PY-GQLPDTSK-------VDPAKDLVFTWNGTTSGVRVPNADFI--SADREGIVICDATSAAFAQDL--DWTKLDVVTFSWQKALGGEGAHGVLILSPRAVARLESYTP-AWPMPKLFRMTKANKDGGNKVALDI-FEGATINTPSMLCVEDALDALKWAASIGGLEAMQGRADQNLAVLADWVARTPW-VEFLAATPEIRSNTSVCLKVVDPAIAALSDDAQADFAKKLASLLEKEGAALDIGGYRDA-PAGLRIWCGATVEASDVEALTPWLDWAFATV-------------S---AELAAA >Mycobacterium tuberculosis H37Rv ------MADQLT-------------------------PHLEIPTAIKPRDGRFGSGPSKVRLE-QLQTLT-------TTAAALFGTSHRQAPVKNLVGRVRSGLAELFSLPD-GYEVILGNGGATAFWDAAAFGLIDK--RSLHLTYGEFSAKFASAVSKNPFVGEPIIITS-----DP-GSAPEPQT-------DPSVDVIAWAHNETSTGVAVAVRRPE--G-SDDALVVIDATSGAGGLPV--DIAETDAYYFAPQKNFASDGGLWLAIMSPAALSRIEAIAATGRWVPDFLSLPIAV--------ENS-LKNQTYNTPAIATLALLAEQIDWLVGNGGLDWAVKRTADSSQRLYSWAQERPY-TTPFVTDPGLRSQVVGTIDFVDDVDAG-----------TVAKI-LRANGIVDTEPYRKLGRNQLRVAMFPAVEPDDVSALTECVDWVV-----------------------ERL >Corynebacterium glutamicum ATCC13032 ------MTDFPT-----------------------------LPSEFIPGDGRFGCGPSKVRPE-QIQAIV-------DGSASVIGTSHRQPAVKNVVGSIREGLSDLFSLPE-GYEIILSLGGATAFWDAATFGLIEK--KSGHLSFGEFSSKFAKASKLAPWLDEPEIVTA-----ET-GDSPAPQA-------FEGADVIAWAHNETSTGAMVPVLRPE--G-SEGSLVAIDATSGAGGLPV--DIKNSDVYYFSPQKCFASDGGLWLAAMSPAALERIEKINASDRFIPEFLNLQTAV--------DNS-LKNQTYNTPAVATLLMLDNQVKWMNSNGGLDGMVARTTASSSALYNWAEAREE-ASPYVADAAKRSLVVGTIDFDDSIDAA-----------VIAKI-LRANGILDTEPYRKLGRNQLRIGMFPAIDSTDVEKLTGAIDFIL-------------------DGGFARK >Arthrobacter arilaitensis Re117 
------MSTDIK-----------------------------IPENLLPADGRFGAGPSKVRAE-QVQAIV-------DAGPELLGTSHRQAPVKNLVASVQDGLKEMFNAPA-GYEVLLGVGGSTAFWDAAAFSLVRS--KAQHLSFGEFGSKFAKATDKAPFLEASSIIVG-----EP-GTVPEPVA-------EADVDLYAWPHNETSTGAAAPIQRVA--GANADALVVIDATSAAGGLDV--DLAETDVYYFAPQKNFASDGGLWLAFVSPAAIARIEEIAATDRWIPDFLNLKTAL--------DNS-LKNQTYNTPSLTTLVGLDAQIKWINANGGLKWAAARTAESAGKIQAWAEASEI-AAPYVANPAHRSNVISTVDFADSVDAS-----------AIAKV-LRANGVVDVEPYRKLGRNQLRIATFVAIEPNDVESLLKCIDYVI-----------------------EQL >Propionibacterium acnes SK137 ------MPPNRVSLILPHDHSMNSAVIDDMICKDRIMAQPMIPRDLLPSDPRFGCGPSRIRRE-VVASLS-------E-PGSVMGTSHRQPPVRHVVAAIREELTELYNLPT-DYEVALGNGGATLFWDMATVSLVEK--RAATGVYGEFTRKFSSALQRAPFLADPAVFQA-----EP-GKLALPKA-------VSDVDTYAWAHNETSTGVVAPVRRPN--DIDDNSLVLVDATSAAGGVAA--DMSTIDAYYFSPQKNLSSDGGLWLAILSPAAIERSNRVTSSARWVPQMLDLSLAV--------TNS-RADQTLNTPALATLVMLEAQCRWLLDQGGMAWAASRTASTSGILYRWAEDNPL-TTPFVADPALRSPVVVTIDIDESVDAA-----------RLCAR-ARDNGILDIEPYRKLGRNQIRIATFSSIEPSDVEALTACLDWLL----------------------ENRD >Clostridium beijerinckii NCIMB8052 ------MSRV----------------------------------------YNFSAGPAVLPES-VLREAAGEMLDYKGTGMSVMEMSHRSKAFEEIITDAEKTLRELMNIPD-NYKVLFLQGGASQQFAMIPMNLMKNK-VVDHIITGQWAKKAASEAKI---FGKVNILASS-EDKTF-SYIPDLKD----LKVSEDADYVYICHNNTIYGTTYK--ELPNV---GDKILVADMSSDFLSEPV--DVSKYGLIFAGVQKNAGP-AGVVVVIIREDLITED--V---LPGTPTMLRYKVHA--------DN----KSLYNTPPAYGIYICGKVFKWVKNKGGLEAMKKINEEKASILYDFLDSSS-MFKGTV-VKKDRSLMNVPFVTGSDELDA-----------KFVKE-AKAVGFENLKGHRTVGG--MRASIYNAMPIEGVKDLVEFMRK----------------------FEEDNK >Flavobacterium johnsoniae UW101 
------MKK-----------------------------------------HNYSAGPSILPQE-VFEKASKAVLNFNDSGLSILEISHRSKDFVAVMDEARSLALELLGLQGKGYQALFLQGGASTAFLMAPYNLMKENGKAAYLDSGTWATAAIKEAKL---FGETVIVGSS-KDDNY-TYIPKGYE----I-PA-DADYFHCTSNNTIFGTQIQ--EFP----STNIPVVCDMSSDIFSREL--DFSKFDLIYAGAQKNMGP-AGTTLVVVKEEILGKN------GRTIPSMLDYAKHI--------KA----ESMYNTPSVFAVYVSLLTLQWIKAKGGIAAVEKLNNAKADLLYAEIDRNP-LFKGAA-NVEDRSKMNVTFLLNNPEHTE-----------TFDAL-WKAAGISGLPGHRSVGG--YRASIYNAMPIESVQVLVDVMKA-----------------------LESKV >Bacillus subtilis subsp. subtilis str. 168 ------MERT----------------------------------------TNFNAGPAALPLE-VLQKAQKEFIDFNESGMSVMELSHRSKEYEAVHQKAKSLLIELMGIPE-DYDILFLQGGASLQFSMLPMNFLTPEKTAHFVMTGAWSEKALAETKL---FGNTSITATS-ETDNY-SYIPEVDL----T-DVKDGAYLHITSNNTIFGTQWQ--EFP----NSPIPLVADMSSDILSRKI--DVSKFDVIYGGAQKNLGP-SGVTVVIMKKSWLQNE------NANVPKILKYSTHV--------KA----DSLYNTPPTFAIYMLSLVLEWLKENGGVEAVEQRNEQKAQVLYSCIDESNGFYKGHA-RKDSRSRMNVTFTLRDDELTK-----------TFVQK-AKDAKMIGLGGHRSVGG--CRASIYNAVSLEDCEKLAAFMKK----------------------FQQENE >Escherichia coli str. K-12 substr. 
MG1655 ------MAQI----------------------------------------FNFSSGPAMLPAE-VLKQAQQELRDWNGLGTSVMEVSHRGKEFIQVAEEAEKDFRDLLNVPS-NYKVLFCHGGGRGQFAAVPLNILGDKTTADYVDAGYWAASAIKEAKK---YCTPNVFDAKVTVDGLRAVKPMREW----Q-LSDNAAYMHYCPNETIDGIAID--ETPDF--GADVVVAADFSSTILSRPI--DVSRYGVIYAGAQKNIGP-AGLTIVIVREDLLGKA------NIACPSILDYSILN--------DN----GSMFNTPPTFAWYLSGLVFKWLKANGGVAEMDKINQQKAELLYGVIDNSD-FYRNDV-AKANRSRMNVPFQLADSALDK-----------LFLEE-SFAAGLHALKGHRVVGG--MRASIYNAMPLEGVKALTDFMVE----------------------FERRHG >Archaeoglobus fulgidus DSM 4304 ------MLL-------------------------------------------MIPGPVQLHER-IIRAMA------------RQMIGHRTADFSAIMEFCVEKLREIFGTKG-DIC--LISGSGTAGMEAAIASFSRV-KKIATLENGKFGERLGDIAERYT---QVERVKV-----PW-GESFELDAVKEAL--DNGCEAVAFVHNETSTGILNPAKEIAKLAKEYDALVIMDAITSAGGDYVKMDEWGVDVAIVGSQKCLGAPPGLAAVAVSEKAWDYYNER------CPYYLDLAAYR--------KKL-KDMQTPYTPAVPLFFALAEALKIIDEE-GLENRIQRHRILSAAVRKWAVEAGLELFPNLNKYSSYSNTVTAIKMPEGVSDS-----------ELRGTLKKEYGILVSGGQGELKGKIFRIGTMGNVGKFETVSTLAALEDVLMRK---NA-IKPALQY-AQILLRDLQ >Methanococcus maripaludis strain S2 RDLMKQMDTE---------------------------------------KLLMIPGPTMVPSR-VLNTMA------------LPIIGHRTSDFGDLTGDTVDMMKKVFQTEN-DTY--IITGSGTAVMDMAISNTLDKGDKVINITNGNFGERFYKISSVYK--ADTIKYEP-----EW-GDLADPKKLRELLEENEGIKAVTVVHNETSTGAKNPIEDLGNVVKDFDAIYIVDTISSLGGDYVNVDKFNIDICVTGSQKCIAAPPGLAAITVGEKAWDVVSKT-ET---KSFYLDLNAYK--------KSWDAKKETPYTPSVSLTYAMNEALEMVLEE-GLENRFKRHDLLARATRAGLEAMGL---ELFAKERARSVTVTSAKYPEGIDDK-----------KFRGLLAEKYNIRVAGGQSHLAGKIFRVGHMGSAKEYQVLGTLAAIELAFKEL----G-YNAEGGVA-AAKKVLSN >Ammonifex degensii KC4 
------MKKE---------------------------------------TRLFIPGPTPVPPA-VAEAMA------------RPLIGHRTEDFARLYARLEERLRVVLGTKN-DIV--ILTSSGTGGMEAAVANLVSPGDPVLALVTGKFGERFAELAKVYG--GAVEVMEF-----GW-GKAVDLEAVEEKL-KARRFKVVLATHNETSTTVVNDIRGLGELTRRYGALLVVDAVSSAGGMEIRMDDWGVDVLVTASQKALMVPPGLAIVAASDAAWKAMEEN-KN---PRYYLDLLAAR--------KSK-QKYNTPYTPAVSLFVGLDRALDLILAE-GLEKVYRKHRLLARAVRAAIRALGLKLM---IPDEYASPVVTGVWAPEGIEVD-----------RLRKEIASRYGVLLAGGQGPLKGKIFRISHMGYVDAVDILGALGALELGLYRFGFKFKLGEGLAQAQAVLAEEGEE Each of my 4 original sequences defines a “group”, so I have shown a global alignment of 3-4 sequences from each “group”. The gray columns indicate the only places where all sequences are identical.
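
The "gray columns" idea, positions where every aligned sequence carries the same residue, can be sketched in a few lines. This is only an illustration of the concept, not the tool used to make the slide. The function name `fully_conserved_columns` and the toy sequences are assumptions; the real input would be the aligned sequences shown above, one string per organism.

```python
def fully_conserved_columns(aligned_sequences):
    """Return 0-based column indices where all sequences share one residue.

    aligned_sequences is a list of equal-length strings from a multiple
    sequence alignment, with '-' marking gaps. Columns containing any gap
    are never reported as conserved.
    """
    if not aligned_sequences:
        return []

    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    # zip(*...) walks the alignment column by column.
    for index, column in enumerate(zip(*aligned_sequences)):
        residues = set(column)
        if len(residues) == 1 and "-" not in residues:
            conserved.append(index)
    return conserved


# Toy alignment (made-up sequences, not taken from the slide).
toy = ["MK-GP", "MR-GP", "MKAGP"]
print(fully_conserved_columns(toy))
```

The more sequences and the more distant the organisms, the fewer columns pass this test, which is why only a handful of columns are gray in an alignment that spans bacteria and archaea.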

  5. >Agrobacterium tumefaciens C58 ------MTDIMK------------------------------P-DLRPGNTHFSSGPCSKRPGWSLDAL----------SDAPLGRSHRAKVGKAKLKQAIDLTREILNVPA-DYRIGIVPASDTGAVEMALWSLLGE-RGVDMLAWESFGAGWVTDVVKQLKLKDVRKFEA-----DY-GLLPNLAE-------VDFDRDVVFTWNGTTSGVRVANADFI--PADRKGLTICDATSAAFAQDM--DFTKLDVVTFSWQKVLGGEGGHGVIILSPRAVERLLSYSP-AWPLPKIFRMVSGGK-----LIEGI-FTGETINTPSMLCVEDYIDALLWAKNLGGLKALIGRADANAKVIYDFIEKNNW-IANLAVKPETRSNTSVCLKIVDPEVQALDAAAQADFAKGIVALLEKENVALDIGAYRDA-PSGLRIWAGATIETADMEAVMPWLAWAYQTQ-------------K-AALSKAAA >Bradyrhizobium japonicum USDA110 ------MTVA-K------------------------------P-ASRPNVPHFSSGPCAKRPGWNAQNL----------KDAALGRSHRAKVGKTKLKLAIDLTREVLEVPA-DYRIGIVPASDTGAVEMALWSLLGA-RPVTTLAWESFGEGWVSDIVKELKLKDVTKLNA-----AY-GEIPDLSK-------VDPKSDVVFTWNGTTSGVRVPNADWI--SATREGLTICDATSAAFAQAL--DWAKLDVVTFSWQKALGGEAAHGMLILSPRAVERLETYKP-AWPLPKIFRMTKGGK-----INEGI-FVGETINTPSMLCVEDYLDALNWAKSIGGLKALIARADANTKVLADWKAKTPW-IDFLAKDASIRSNTSVCLKFIDPALTALSDDAQAEFSKKLVALVEKEGAGYDFAYYRDA-PAGLRIWCGATVEARDVELLTQWIDWAFAET-------------K-AQ-LAKAA >Caulobacter sp. 
K31 ------MTTLAK------------------------------P-AQRPARPEFSSGPCAKRPGWTPENL----------RNAVLGRSHRSKLGKARLKAAIDQTRDVLEVPA-DFLIGIVPGSDTGAVEMAMWSMLGQ-RPVQLLAFESFGKDWVTDVTKQLKLPNVEVLDA-----PY-GQLPDTSK-------VDPAKDLVFTWNGTTSGVRVPNADFI--SADREGIVICDATSAAFAQDL--DWTKLDVVTFSWQKALGGEGAHGVLILSPRAVARLESYTP-AWPMPKLFRMTKANKDGGNKVALDI-FEGATINTPSMLCVEDALDALKWAASIGGLEAMQGRADQNLAVLADWVARTPW-VEFLAATPEIRSNTSVCLKVVDPAIAALSDDAQADFAKKLASLLEKEGAALDIGGYRDA-PAGLRIWCGATVEASDVEALTPWLDWAFATV-------------S---AELAAA >Mycobacterium tuberculosis H37Rv ------MADQLT-------------------------PHLEIPTAIKPRDGRFGSGPSKVRLE-QLQTLT-------TTAAALFGTSHRQAPVKNLVGRVRSGLAELFSLPD-GYEVILGNGGATAFWDAAAFGLIDK--RSLHLTYGEFSAKFASAVSKNPFVGEPIIITS-----DP-GSAPEPQT-------DPSVDVIAWAHNETSTGVAVAVRRPE--G-SDDALVVIDATSGAGGLPV--DIAETDAYYFAPQKNFASDGGLWLAIMSPAALSRIEAIAATGRWVPDFLSLPIAV--------ENS-LKNQTYNTPAIATLALLAEQIDWLVGNGGLDWAVKRTADSSQRLYSWAQERPY-TTPFVTDPGLRSQVVGTIDFVDDVDAG-----------TVAKI-LRANGIVDTEPYRKLGRNQLRVAMFPAVEPDDVSALTECVDWVV-----------------------ERL >Corynebacterium glutamicum ATCC13032 ------MTDFPT-----------------------------LPSEFIPGDGRFGCGPSKVRPE-QIQAIV-------DGSASVIGTSHRQPAVKNVVGSIREGLSDLFSLPE-GYEIILSLGGATAFWDAATFGLIEK--KSGHLSFGEFSSKFAKASKLAPWLDEPEIVTA-----ET-GDSPAPQA-------FEGADVIAWAHNETSTGAMVPVLRPE--G-SEGSLVAIDATSGAGGLPV--DIKNSDVYYFSPQKCFASDGGLWLAAMSPAALERIEKINASDRFIPEFLNLQTAV--------DNS-LKNQTYNTPAVATLLMLDNQVKWMNSNGGLDGMVARTTASSSALYNWAEAREE-ASPYVADAAKRSLVVGTIDFDDSIDAA-----------VIAKI-LRANGILDTEPYRKLGRNQLRIGMFPAIDSTDVEKLTGAIDFIL-------------------DGGFARK >Arthrobacter arilaitensis Re117 
------MSTDIK-----------------------------IPENLLPADGRFGAGPSKVRAE-QVQAIV-------DAGPELLGTSHRQAPVKNLVASVQDGLKEMFNAPA-GYEVLLGVGGSTAFWDAAAFSLVRS--KAQHLSFGEFGSKFAKATDKAPFLEASSIIVG-----EP-GTVPEPVA-------EADVDLYAWPHNETSTGAAAPIQRVA--GANADALVVIDATSAAGGLDV--DLAETDVYYFAPQKNFASDGGLWLAFVSPAAIARIEEIAATDRWIPDFLNLKTAL--------DNS-LKNQTYNTPSLTTLVGLDAQIKWINANGGLKWAAARTAESAGKIQAWAEASEI-AAPYVANPAHRSNVISTVDFADSVDAS-----------AIAKV-LRANGVVDVEPYRKLGRNQLRIATFVAIEPNDVESLLKCIDYVI-----------------------EQL >Propionibacterium acnes SK137 ------MPPNRVSLILPHDHSMNSAVIDDMICKDRIMAQPMIPRDLLPSDPRFGCGPSRIRRE-VVASLS-------E-PGSVMGTSHRQPPVRHVVAAIREELTELYNLPT-DYEVALGNGGATLFWDMATVSLVEK--RAATGVYGEFTRKFSSALQRAPFLADPAVFQA-----EP-GKLALPKA-------VSDVDTYAWAHNETSTGVVAPVRRPN--DIDDNSLVLVDATSAAGGVAA--DMSTIDAYYFSPQKNLSSDGGLWLAILSPAAIERSNRVTSSARWVPQMLDLSLAV--------TNS-RADQTLNTPALATLVMLEAQCRWLLDQGGMAWAASRTASTSGILYRWAEDNPL-TTPFVADPALRSPVVVTIDIDESVDAA-----------RLCAR-ARDNGILDIEPYRKLGRNQIRIATFSSIEPSDVEALTACLDWLL----------------------ENRD >Clostridium beijerinckii NCIMB8052 ------MSRV----------------------------------------YNFSAGPAVLPES-VLREAAGEMLDYKGTGMSVMEMSHRSKAFEEIITDAEKTLRELMNIPD-NYKVLFLQGGASQQFAMIPMNLMKNK-VVDHIITGQWAKKAASEAKI---FGKVNILASS-EDKTF-SYIPDLKD----LKVSEDADYVYICHNNTIYGTTYK--ELPNV---GDKILVADMSSDFLSEPV--DVSKYGLIFAGVQKNAGP-AGVVVVIIREDLITED--V---LPGTPTMLRYKVHA--------DN----KSLYNTPPAYGIYICGKVFKWVKNKGGLEAMKKINEEKASILYDFLDSSS-MFKGTV-VKKDRSLMNVPFVTGSDELDA-----------KFVKE-AKAVGFENLKGHRTVGG--MRASIYNAMPIEGVKDLVEFMRK----------------------FEEDNK >Flavobacterium johnsoniae UW101 
------MKK-----------------------------------------HNYSAGPSILPQE-VFEKASKAVLNFNDSGLSILEISHRSKDFVAVMDEARSLALELLGLQGKGYQALFLQGGASTAFLMAPYNLMKENGKAAYLDSGTWATAAIKEAKL---FGETVIVGSS-KDDNY-TYIPKGYE----I-PA-DADYFHCTSNNTIFGTQIQ--EFP----STNIPVVCDMSSDIFSREL--DFSKFDLIYAGAQKNMGP-AGTTLVVVKEEILGKN------GRTIPSMLDYAKHI--------KA----ESMYNTPSVFAVYVSLLTLQWIKAKGGIAAVEKLNNAKADLLYAEIDRNP-LFKGAA-NVEDRSKMNVTFLLNNPEHTE-----------TFDAL-WKAAGISGLPGHRSVGG--YRASIYNAMPIESVQVLVDVMKA-----------------------LESKV >Bacillus subtilis subsp. subtilis str. 168 ------MERT----------------------------------------TNFNAGPAALPLE-VLQKAQKEFIDFNESGMSVMELSHRSKEYEAVHQKAKSLLIELMGIPE-DYDILFLQGGASLQFSMLPMNFLTPEKTAHFVMTGAWSEKALAETKL---FGNTSITATS-ETDNY-SYIPEVDL----T-DVKDGAYLHITSNNTIFGTQWQ--EFP----NSPIPLVADMSSDILSRKI--DVSKFDVIYGGAQKNLGP-SGVTVVIMKKSWLQNE------NANVPKILKYSTHV--------KA----DSLYNTPPTFAIYMLSLVLEWLKENGGVEAVEQRNEQKAQVLYSCIDESNGFYKGHA-RKDSRSRMNVTFTLRDDELTK-----------TFVQK-AKDAKMIGLGGHRSVGG--CRASIYNAVSLEDCEKLAAFMKK----------------------FQQENE >Escherichia coli str. K-12 substr. 
MG1655 ------MAQI----------------------------------------FNFSSGPAMLPAE-VLKQAQQELRDWNGLGTSVMEVSHRGKEFIQVAEEAEKDFRDLLNVPS-NYKVLFCHGGGRGQFAAVPLNILGDKTTADYVDAGYWAASAIKEAKK---YCTPNVFDAKVTVDGLRAVKPMREW----Q-LSDNAAYMHYCPNETIDGIAID--ETPDF--GADVVVAADFSSTILSRPI--DVSRYGVIYAGAQKNIGP-AGLTIVIVREDLLGKA------NIACPSILDYSILN--------DN----GSMFNTPPTFAWYLSGLVFKWLKANGGVAEMDKINQQKAELLYGVIDNSD-FYRNDV-AKANRSRMNVPFQLADSALDK-----------LFLEE-SFAAGLHALKGHRVVGG--MRASIYNAMPLEGVKALTDFMVE----------------------FERRHG >Archaeoglobus fulgidus DSM 4304 ------MLL-------------------------------------------MIPGPVQLHER-IIRAMA------------RQMIGHRTADFSAIMEFCVEKLREIFGTKG-DIC--LISGSGTAGMEAAIASFSRV-KKIATLENGKFGERLGDIAERYT---QVERVKV-----PW-GESFELDAVKEAL--DNGCEAVAFVHNETSTGILNPAKEIAKLAKEYDALVIMDAITSAGGDYVKMDEWGVDVAIVGSQKCLGAPPGLAAVAVSEKAWDYYNER------CPYYLDLAAYR--------KKL-KDMQTPYTPAVPLFFALAEALKIIDEE-GLENRIQRHRILSAAVRKWAVEAGLELFPNLNKYSSYSNTVTAIKMPEGVSDS-----------ELRGTLKKEYGILVSGGQGELKGKIFRIGTMGNVGKFETVSTLAALEDVLMRK---NA-IKPALQY-AQILLRDLQ >Methanococcus maripaludis strain S2 RDLMKQMDTE---------------------------------------KLLMIPGPTMVPSR-VLNTMA------------LPIIGHRTSDFGDLTGDTVDMMKKVFQTEN-DTY--IITGSGTAVMDMAISNTLDKGDKVINITNGNFGERFYKISSVYK--ADTIKYEP-----EW-GDLADPKKLRELLEENEGIKAVTVVHNETSTGAKNPIEDLGNVVKDFDAIYIVDTISSLGGDYVNVDKFNIDICVTGSQKCIAAPPGLAAITVGEKAWDVVSKT-ET---KSFYLDLNAYK--------KSWDAKKETPYTPSVSLTYAMNEALEMVLEE-GLENRFKRHDLLARATRAGLEAMGL---ELFAKERARSVTVTSAKYPEGIDDK-----------KFRGLLAEKYNIRVAGGQSHLAGKIFRVGHMGSAKEYQVLGTLAAIELAFKEL----G-YNAEGGVA-AAKKVLSN >Ammonifex degensii KC4 
------MKKE---------------------------------------TRLFIPGPTPVPPA-VAEAMA------------RPLIGHRTEDFARLYARLEERLRVVLGTKN-DIV--ILTSSGTGGMEAAVANLVSPGDPVLALVTGKFGERFAELAKVYG--GAVEVMEF-----GW-GKAVDLEAVEEKL-KARRFKVVLATHNETSTTVVNDIRGLGELTRRYGALLVVDAVSSAGGMEIRMDDWGVDVLVTASQKALMVPPGLAIVAASDAAWKAMEEN-KN---PRYYLDLLAAR--------KSK-QKYNTPYTPAVSLFVGLDRALDLILAE-GLEKVYRKHRLLARAVRAAIRALGLKLM---IPDEYASPVVTGVWAPEGIEVD-----------RLRKEIASRYGVLLAGGQGPLKGKIFRISHMGYVDAVDILGALGALELGLYRFGFKFKLGEGLAQAQAVLAEEGEE Each of my 4 original sequences defines a “group”, so I have shown an alignment of 3-4 sequences from each “group”. The gray columns indicate the only places where all sequences are identical.

  6. >Agrobacterium tumefaciens C58 ------MTDIMK------------------------------P-DLRPGNTHFSSGPCSKRPGWSLDAL----------SDAPLGRSHRAKVGKAKLKQAIDLTREILNVPA-DYRIGIVPASDTGAVEMALWSLLGE-RGVDMLAWESFGAGWVTDVVKQLKLKDVRKFEA-----DY-GLLPNLAE-------VDFDRDVVFTWNGTTSGVRVANADFI--PADRKGLTICDATSAAFAQDM--DFTKLDVVTFSWQKVLGGEGGHGVIILSPRAVERLLSYSP-AWPLPKIFRMVSGGK-----LIEGI-FTGETINTPSMLCVEDYIDALLWAKNLGGLKALIGRADANAKVIYDFIEKNNW-IANLAVKPETRSNTSVCLKIVDPEVQALDAAAQADFAKGIVALLEKENVALDIGAYRDA-PSGLRIWAGATIETADMEAVMPWLAWAYQTQ-------------K-AALSKAAA >Bradyrhizobium japonicum USDA110 ------MTVA-K------------------------------P-ASRPNVPHFSSGPCAKRPGWNAQNL----------KDAALGRSHRAKVGKTKLKLAIDLTREVLEVPA-DYRIGIVPASDTGAVEMALWSLLGA-RPVTTLAWESFGEGWVSDIVKELKLKDVTKLNA-----AY-GEIPDLSK-------VDPKSDVVFTWNGTTSGVRVPNADWI--SATREGLTICDATSAAFAQAL--DWAKLDVVTFSWQKALGGEAAHGMLILSPRAVERLETYKP-AWPLPKIFRMTKGGK-----INEGI-FVGETINTPSMLCVEDYLDALNWAKSIGGLKALIARADANTKVLADWKAKTPW-IDFLAKDASIRSNTSVCLKFIDPALTALSDDAQAEFSKKLVALVEKEGAGYDFAYYRDA-PAGLRIWCGATVEARDVELLTQWIDWAFAET-------------K-AQ-LAKAA >Caulobacter sp. 
K31 ------MTTLAK------------------------------P-AQRPARPEFSSGPCAKRPGWTPENL----------RNAVLGRSHRSKLGKARLKAAIDQTRDVLEVPA-DFLIGIVPGSDTGAVEMAMWSMLGQ-RPVQLLAFESFGKDWVTDVTKQLKLPNVEVLDA-----PY-GQLPDTSK-------VDPAKDLVFTWNGTTSGVRVPNADFI--SADREGIVICDATSAAFAQDL--DWTKLDVVTFSWQKALGGEGAHGVLILSPRAVARLESYTP-AWPMPKLFRMTKANKDGGNKVALDI-FEGATINTPSMLCVEDALDALKWAASIGGLEAMQGRADQNLAVLADWVARTPW-VEFLAATPEIRSNTSVCLKVVDPAIAALSDDAQADFAKKLASLLEKEGAALDIGGYRDA-PAGLRIWCGATVEASDVEALTPWLDWAFATV-------------S---AELAAA >Mycobacterium tuberculosis H37Rv ------MADQLT-------------------------PHLEIPTAIKPRDGRFGSGPSKVRLE-QLQTLT-------TTAAALFGTSHRQAPVKNLVGRVRSGLAELFSLPD-GYEVILGNGGATAFWDAAAFGLIDK--RSLHLTYGEFSAKFASAVSKNPFVGEPIIITS-----DP-GSAPEPQT-------DPSVDVIAWAHNETSTGVAVAVRRPE--G-SDDALVVIDATSGAGGLPV--DIAETDAYYFAPQKNFASDGGLWLAIMSPAALSRIEAIAATGRWVPDFLSLPIAV--------ENS-LKNQTYNTPAIATLALLAEQIDWLVGNGGLDWAVKRTADSSQRLYSWAQERPY-TTPFVTDPGLRSQVVGTIDFVDDVDAG-----------TVAKI-LRANGIVDTEPYRKLGRNQLRVAMFPAVEPDDVSALTECVDWVV-----------------------ERL >Corynebacterium glutamicum ATCC13032 ------MTDFPT-----------------------------LPSEFIPGDGRFGCGPSKVRPE-QIQAIV-------DGSASVIGTSHRQPAVKNVVGSIREGLSDLFSLPE-GYEIILSLGGATAFWDAATFGLIEK--KSGHLSFGEFSSKFAKASKLAPWLDEPEIVTA-----ET-GDSPAPQA-------FEGADVIAWAHNETSTGAMVPVLRPE--G-SEGSLVAIDATSGAGGLPV--DIKNSDVYYFSPQKCFASDGGLWLAAMSPAALERIEKINASDRFIPEFLNLQTAV--------DNS-LKNQTYNTPAVATLLMLDNQVKWMNSNGGLDGMVARTTASSSALYNWAEAREE-ASPYVADAAKRSLVVGTIDFDDSIDAA-----------VIAKI-LRANGILDTEPYRKLGRNQLRIGMFPAIDSTDVEKLTGAIDFIL-------------------DGGFARK >Arthrobacter arilaitensis Re117 
------MSTDIK-----------------------------IPENLLPADGRFGAGPSKVRAE-QVQAIV-------DAGPELLGTSHRQAPVKNLVASVQDGLKEMFNAPA-GYEVLLGVGGSTAFWDAAAFSLVRS--KAQHLSFGEFGSKFAKATDKAPFLEASSIIVG-----EP-GTVPEPVA-------EADVDLYAWPHNETSTGAAAPIQRVA--GANADALVVIDATSAAGGLDV--DLAETDVYYFAPQKNFASDGGLWLAFVSPAAIARIEEIAATDRWIPDFLNLKTAL--------DNS-LKNQTYNTPSLTTLVGLDAQIKWINANGGLKWAAARTAESAGKIQAWAEASEI-AAPYVANPAHRSNVISTVDFADSVDAS-----------AIAKV-LRANGVVDVEPYRKLGRNQLRIATFVAIEPNDVESLLKCIDYVI-----------------------EQL >Propionibacterium acnes SK137 ------MPPNRVSLILPHDHSMNSAVIDDMICKDRIMAQPMIPRDLLPSDPRFGCGPSRIRRE-VVASLS-------E-PGSVMGTSHRQPPVRHVVAAIREELTELYNLPT-DYEVALGNGGATLFWDMATVSLVEK--RAATGVYGEFTRKFSSALQRAPFLADPAVFQA-----EP-GKLALPKA-------VSDVDTYAWAHNETSTGVVAPVRRPN--DIDDNSLVLVDATSAAGGVAA--DMSTIDAYYFSPQKNLSSDGGLWLAILSPAAIERSNRVTSSARWVPQMLDLSLAV--------TNS-RADQTLNTPALATLVMLEAQCRWLLDQGGMAWAASRTASTSGILYRWAEDNPL-TTPFVADPALRSPVVVTIDIDESVDAA-----------RLCAR-ARDNGILDIEPYRKLGRNQIRIATFSSIEPSDVEALTACLDWLL----------------------ENRD >Clostridium beijerinckii NCIMB8052 ------MSRV----------------------------------------YNFSAGPAVLPES-VLREAAGEMLDYKGTGMSVMEMSHRSKAFEEIITDAEKTLRELMNIPD-NYKVLFLQGGASQQFAMIPMNLMKNK-VVDHIITGQWAKKAASEAKI---FGKVNILASS-EDKTF-SYIPDLKD----LKVSEDADYVYICHNNTIYGTTYK--ELPNV---GDKILVADMSSDFLSEPV--DVSKYGLIFAGVQKNAGP-AGVVVVIIREDLITED--V---LPGTPTMLRYKVHA--------DN----KSLYNTPPAYGIYICGKVFKWVKNKGGLEAMKKINEEKASILYDFLDSSS-MFKGTV-VKKDRSLMNVPFVTGSDELDA-----------KFVKE-AKAVGFENLKGHRTVGG--MRASIYNAMPIEGVKDLVEFMRK----------------------FEEDNK >Flavobacterium johnsoniae UW101 
------MKK-----------------------------------------HNYSAGPSILPQE-VFEKASKAVLNFNDSGLSILEISHRSKDFVAVMDEARSLALELLGLQGKGYQALFLQGGASTAFLMAPYNLMKENGKAAYLDSGTWATAAIKEAKL---FGETVIVGSS-KDDNY-TYIPKGYE----I-PA-DADYFHCTSNNTIFGTQIQ--EFP----STNIPVVCDMSSDIFSREL--DFSKFDLIYAGAQKNMGP-AGTTLVVVKEEILGKN------GRTIPSMLDYAKHI--------KA----ESMYNTPSVFAVYVSLLTLQWIKAKGGIAAVEKLNNAKADLLYAEIDRNP-LFKGAA-NVEDRSKMNVTFLLNNPEHTE-----------TFDAL-WKAAGISGLPGHRSVGG--YRASIYNAMPIESVQVLVDVMKA-----------------------LESKV >Bacillus subtilis subsp. subtilis str. 168 ------MERT----------------------------------------TNFNAGPAALPLE-VLQKAQKEFIDFNESGMSVMELSHRSKEYEAVHQKAKSLLIELMGIPE-DYDILFLQGGASLQFSMLPMNFLTPEKTAHFVMTGAWSEKALAETKL---FGNTSITATS-ETDNY-SYIPEVDL----T-DVKDGAYLHITSNNTIFGTQWQ--EFP----NSPIPLVADMSSDILSRKI--DVSKFDVIYGGAQKNLGP-SGVTVVIMKKSWLQNE------NANVPKILKYSTHV--------KA----DSLYNTPPTFAIYMLSLVLEWLKENGGVEAVEQRNEQKAQVLYSCIDESNGFYKGHA-RKDSRSRMNVTFTLRDDELTK-----------TFVQK-AKDAKMIGLGGHRSVGG--CRASIYNAVSLEDCEKLAAFMKK----------------------FQQENE >Escherichia coli str. K-12 substr. 
MG1655 ------MAQI----------------------------------------FNFSSGPAMLPAE-VLKQAQQELRDWNGLGTSVMEVSHRGKEFIQVAEEAEKDFRDLLNVPS-NYKVLFCHGGGRGQFAAVPLNILGDKTTADYVDAGYWAASAIKEAKK---YCTPNVFDAKVTVDGLRAVKPMREW----Q-LSDNAAYMHYCPNETIDGIAID--ETPDF--GADVVVAADFSSTILSRPI--DVSRYGVIYAGAQKNIGP-AGLTIVIVREDLLGKA------NIACPSILDYSILN--------DN----GSMFNTPPTFAWYLSGLVFKWLKANGGVAEMDKINQQKAELLYGVIDNSD-FYRNDV-AKANRSRMNVPFQLADSALDK-----------LFLEE-SFAAGLHALKGHRVVGG--MRASIYNAMPLEGVKALTDFMVE----------------------FERRHG >Archaeoglobus fulgidus DSM 4304 ------MLL-------------------------------------------MIPGPVQLHER-IIRAMA------------RQMIGHRTADFSAIMEFCVEKLREIFGTKG-DIC--LISGSGTAGMEAAIASFSRV-KKIATLENGKFGERLGDIAERYT---QVERVKV-----PW-GESFELDAVKEAL--DNGCEAVAFVHNETSTGILNPAKEIAKLAKEYDALVIMDAITSAGGDYVKMDEWGVDVAIVGSQKCLGAPPGLAAVAVSEKAWDYYNER------CPYYLDLAAYR--------KKL-KDMQTPYTPAVPLFFALAEALKIIDEE-GLENRIQRHRILSAAVRKWAVEAGLELFPNLNKYSSYSNTVTAIKMPEGVSDS-----------ELRGTLKKEYGILVSGGQGELKGKIFRIGTMGNVGKFETVSTLAALEDVLMRK---NA-IKPALQY-AQILLRDLQ >Methanococcus maripaludis strain S2 RDLMKQMDTE---------------------------------------KLLMIPGPTMVPSR-VLNTMA------------LPIIGHRTSDFGDLTGDTVDMMKKVFQTEN-DTY--IITGSGTAVMDMAISNTLDKGDKVINITNGNFGERFYKISSVYK--ADTIKYEP-----EW-GDLADPKKLRELLEENEGIKAVTVVHNETSTGAKNPIEDLGNVVKDFDAIYIVDTISSLGGDYVNVDKFNIDICVTGSQKCIAAPPGLAAITVGEKAWDVVSKT-ET---KSFYLDLNAYK--------KSWDAKKETPYTPSVSLTYAMNEALEMVLEE-GLENRFKRHDLLARATRAGLEAMGL---ELFAKERARSVTVTSAKYPEGIDDK-----------KFRGLLAEKYNIRVAGGQSHLAGKIFRVGHMGSAKEYQVLGTLAAIELAFKEL----G-YNAEGGVA-AAKKVLSN >Ammonifex degensii KC4 
------MKKE---------------------------------------TRLFIPGPTPVPPA-VAEAMA------------RPLIGHRTEDFARLYARLEERLRVVLGTKN-DIV--ILTSSGTGGMEAAVANLVSPGDPVLALVTGKFGERFAELAKVYG--GAVEVMEF-----GW-GKAVDLEAVEEKL-KARRFKVVLATHNETSTTVVNDIRGLGELTRRYGALLVVDAVSSAGGMEIRMDDWGVDVLVTASQKALMVPPGLAIVAASDAAWKAMEEN-KN---PRYYLDLLAAR--------KSK-QKYNTPYTPAVSLFVGLDRALDLILAE-GLEKVYRKHRLLARAVRAAIRALGLKLM---IPDEYASPVVTGVWAPEGIEVD-----------RLRKEIASRYGVLLAGGQGPLKGKIFRISHMGYVDAVDILGALGALELGLYRFGFKFKLGEGLAQAQAVLAEEGEE
Are these 4 groups of proteins all homologs? Lots of gaps were needed to align them all, and very few residues are conserved across all of them. What do you think? How about some more data? On the next slide is a prediction of where alpha-helices (indicated by H’s) and beta-strands (indicated by E’s) may form in each of my original 4 protein sequences.
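To put numbers on "lots of gaps" and "very few residues conserved", here is a rough sketch that scores one pair of rows from a multiple alignment. It assumes equal-length aligned strings with `-` for gaps. The helper names `percent_identity` and `gap_fraction` and the toy strings are hypothetical. The identity here is computed over columns where both sequences have a residue, which is only one of several conventions, so it should not be expected to reproduce BLAST's reported values.

```python
def percent_identity(a, b):
    """Percent of residue-vs-residue columns in which a and b are identical."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where neither sequence has a gap.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def gap_fraction(seq):
    """Fraction of alignment columns that are gaps in this one sequence."""
    return seq.count("-") / len(seq)


# Hypothetical toy rows, not the sequences from the slides.
row_a = "MK-AL"
row_b = "MR-AL"
print(percent_identity(row_a, row_b))
print(gap_fraction(row_a))
```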

  7. Agrobacterium ------------------------HHHHHHHH-----------HHHHHHHHHHHHHHHHHH------EEEEEE---HHHHHHHHHH------EEEEEE--HHHHHHHHHHHH----EEEEE------------------EEEEE------EE---HHHH------EEEEEE--------------EEEEE------------EEEEE--HHHHHHH---------HHHHHHH--HHHH------------HHHHHHHHHHHHHHHHH--HHHHHHHHHHHHHHHHHHHH-----------------EEEEEEE-------HHHHHHHHHHHHHHHHHH----EEE---------EEEEEE-----HHHHHHHHHHHHHHHHHHHHHH-----
     Mycobacterium ------------------------------HHHHHHHHH------------HHHHHHHHHHHHHHHHHH------EEEEEE---HHHHHHHHHH------EEEEE---HHHHHHHHHHH----EEEEEE----------------EEEEEE----EEEEEEHHH-------EEEEEE--------------EEEEEE-----------EEEEE-HHHHHHHH--------------------------------HHHHHHHHHHHHHHHH---HHHHHHHHHHHHHHHHHHHHH----------------EEEEEEE-----HHHHHHHHHH---EEE---------EEEEEE-----HHHHHHHHHHHHHHHH--
     Escherichia ---------------HHHHHHHHH------------------HHHHHHHHHHHHHHHHHHH------EEEEEE---HHHHHHHHHH------EEEEEEE--HHHHHHHHHHHH---EEEEEE-----------HHHHH------EEEEEE-----EEE-----------EEEEEE--------------EEEEEE----------EEEEE-HHHHHH----------HHHHH----------HHHHHHHHHHHHHHHH---HHHHHHHHHHHHHHHHHHHHH---------------EEEEEEE-----HHHHHHHHHH--------------EEEEE-----HHHHHHHHHHHHHHHHH--
     Methanococcus ----------------------HHHHHHH---------HHHHHHHHHHHHHHHHHH-----EEEEE---HHHHHHHHHHH------EEEEE------HHHHHHHH---EEEEEE---------HHHHHH-------EEEEEEE----EEEEE-HHHHHHHHH----EEEEEHHHH-------------EEEEE-----------EEEEEE----HHH-------------------------------HHHHHHHHHHHHHHHH---HHHHHHHHHHHHHHHHHH-----EEE---------EEEEE------HHHHHHHHHH---EEEE---------EEEEE------HHHHHHHHHHHHHHHHHH----HHHHHHHHHHH--
     Feeling any more confident about drawing a conclusion? How about one last piece of data? Luckily, X-ray crystal structures are available for 1 protein from the blue group (Mycobacterium) and for 2 proteins from the green group (E. coli, Bacillus). All 3 of these proteins form homodimers. Ribbon diagrams are shown on the next slide.
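One way to compare secondary-structure strings like the ones on this slide is to collapse each into the order of its elements (helix, strand, helix, ...) and then compare those orders. The sketch below does that under stated assumptions: the strings use `H`, `E`, and `-` only, and the names `element_order` and `order_similarity` plus the toy strings are hypothetical. `difflib.SequenceMatcher` is a general-purpose string comparison from the Python standard library, swapped in here for convenience rather than a structural-biology method.

```python
from difflib import SequenceMatcher
from itertools import groupby


def element_order(ss):
    """Collapse runs so '--HHHH---EEE--HH' becomes 'HEH' (coil '-' is dropped)."""
    return "".join(state for state, _ in groupby(ss) if state in ("H", "E"))


def order_similarity(ss_a, ss_b):
    """Similarity (0 to 1) between the helix/strand orders of two predictions."""
    return SequenceMatcher(None, element_order(ss_a), element_order(ss_b)).ratio()


# Hypothetical toy predictions, not the strings from the slide.
pred_a = "--HHHH---EEEE--HHHHH--EEE-"
pred_b = "-HHH-----EEEEEE-HHH---EE--"
print(element_order(pred_a))
print(order_similarity(pred_a, pred_b))
```

Two predictions can differ in element lengths and spacing, as these toy strings do, and still share the same helix/strand order, which is the kind of agreement that sequence identity alone does not capture.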

  8. Mycobacterium tuberculosis SerC on top, Bacillus circulans SerC to the left, E. coli SerC to the right. What do we finally conclude? What does this example tell us about the limits of primary sequence similarity as a guide to homology? This example illustrates many of the bioinformatics tools that we will use during the institute.
