gpdat_1.seq    Genetic Sequence Data Bank
04-23-2012

GenPept Release 189.0
Translated Protein-coding Sequences (Part 1)

3794625 loci containing 1188821205 residues
===========================================================
gpdat_2.seq    Genetic Sequence Data Bank
04-23-2012

GenPept Release 189.0
Translated Protein-coding Sequences (Part 2)

3790846 loci containing 1118924042 residues
===========================================================
gpdat_3.seq    Genetic Sequence Data Bank
04-23-2012

GenPept Release 189.0
Translated Protein-coding Sequences (Part 3)

3767161 loci containing 1157163892 residues
===========================================================
gpdat_4.seq    Genetic Sequence Data Bank
04-23-2012

GenPept Release 189.0
Translated Protein-coding Sequences (Part 4)

2276358 loci containing 660542060 residues
===========================================================
TOTAL 13628990 loci containing 4125451199 residues

Table of Contents

1. INTRODUCTION
   1.1 Release 189.0
   1.2 Organization of This Document
   1.3 Important Changes in Release 189.0
   1.4 Recent Changes in the Data Bank
   1.5 Upcoming Changes
2. ORGANIZATION OF FILES
   2.1 File Descriptions
   2.2 Entries by division
3. FILE FORMAT
   3.1 File Header Information
   3.2 Sequence Entry Files
       3.2.1 Entry Organization
       3.2.2 Sample Sequence Data File
4. TRADEMARKS, CITATIONS, ETC.
   4.1 Registered Trademark Notices
   4.2 Citing GenPept
   4.3 GenPept Distribution Format
   4.4 Disclaimer
APPENDIX A - IUPAC-IUB AMINO ACID CODES

List of Examples and Tables

Example 1. Sample File Header
Example 2. Sample Sequence Data File

This document describes the GenPept data bank, available via anonymous FTP from the Advanced Biomedical Computing Center (ftp.ncifcrf.gov). GenPept is produced by parsing the corresponding GenBank release for the translated coding regions defined in the FEATURES section of each sequence entry.

If you have any questions or comments about the data bank or this document, please contact:

Gary Smythers
smytherg@mail.nih.gov

#-----------------------------
# Gary W. Smythers [Contractor]
# Bioinformatics Analyst IV
# Advanced Biomedical Computing Center
# SAIC NCI-Frederick
# National Cancer Institute at Frederick
# Post Office Box B
# Frederick, MD 21702-1201 USA
# Phone: 301-846-5778
# FAX: 301-846-5762
# smytherg@mail.nih.gov
#-----------------------------

1. INTRODUCTION

1.1 Release 189.0

GenPept Release 189.0 includes the translations of all protein coding regions in GenBank Release 189.0. The release contains 13,628,990 loci representing 4,125,451,199 residues. Supplemental files of daily updates, both cumulative and non-cumulative, are also available.

1.2 Organization of This Document

This introduction notes changes to the GenPept data bank since the last release. The next section describes the contents of the files. The third section illustrates the formats of the files.

1.3 Important Changes in Release 189.0

NONE

1.4 Recent Changes in the Data Bank

NONE

1.5 Upcoming Changes

NONE

2. ORGANIZATION OF FILES

2.1 File Descriptions

The GenPept release includes the following files:

/pub/genpept/gprel.txt.gz     - Release Notes (this document).
/pub/genpept/gpdat_1.seq.gz   - GenPept entries Part 1.
/pub/genpept/gpdat_2.seq.gz   - GenPept entries Part 2.
/pub/genpept/gpdat_3.seq.gz   - GenPept entries Part 3.
/pub/genpept/gpdat_4.seq.gz   - GenPept entries Part 4.
/pub/genpept/gpdat.fasta.gz   - All GenPept entries (fasta format).

Individual division files in /pub/genpept/divisions:

gpbct1.seq.gz ... gpbct85.seq.gz    - Bacterial sequences.
gpenv1.seq.gz ... gpenv53.seq.gz    - Environmental sampling sequences.
gpest1.seq.gz ... gpest461.seq.gz   - Expressed sequence tags.
gpgss1.seq.gz ... gpgss255.seq.gz   - Genome Survey Sequence.
gphtc1.seq.gz ... gphtc15.seq.gz    - High Throughput cDNA.
gphtg1.seq.gz ... gphtg136.seq.gz   - High Throughput Genome.
gpinv1.seq.gz ... gpinv30.seq.gz    - Invertebrate sequences.
gpmam1.seq.gz ... gpmam8.seq.gz     - Other mammalian sequences.
gppat1.seq.gz ... gppat178.seq.gz   - Patent sequences.
gpphg1.seq.gz                       - Phage sequences.
gppln1.seq.gz ... gppln55.seq.gz    - Plant sequences.
gppri1.seq.gz ... gppri45.seq.gz    - Primate sequences.
gprod1.seq.gz ... gprod29.seq.gz    - Rodent sequences.
gpsts1.seq.gz ... gpsts20.seq.gz    - STS sequences.
gpsyn1.seq.gz ... gpsyn7.seq.gz     - Synthetic and chimeric sequences.
gptsa1.seq.gz ... gptsa70.seq.gz    - Transcriptome Shotgun Assembly.
gpuna1.seq.gz                       - Unannotated sequences.
gpvrl1.seq.gz ... gpvrl20.seq.gz    - Viral sequences.
gpvrt1.seq.gz ... gpvrt26.seq.gz    - Other vertebrate sequences.

/pub/genpept/updates/gpseq_updates.dat.gz   - Daily cumulative updates.
/pub/genpept/updates/gpncMMDD.seq.gz        - Daily non-cumulative updates.

2.2 Entries by division

Filename Loci Residues
gpbct1 54877 15749170 gpbct10 105798 32949143 gpbct11 104586 33321669 gpbct12 101609 31887878 gpbct13 101588 32262585 gpbct14 7724 1979349 gpbct15 68603 18649012 gpbct16 103162 31999608 gpbct17 83495 26561147 gpbct18 101713 31939520 gpbct19 98752 32703543 gpbct2 95536 28926473 gpbct20 106239 33087760 gpbct21 100119 32288705 gpbct22 100685 31593414 gpbct23 83081 27947490 gpbct24 96410 31002396 gpbct25 100694 30431819 gpbct26 101825 32213812 gpbct27 97786 31178836 gpbct28 99326 31004044 gpbct29 100347 31157176 gpbct3 100121 31552264 gpbct30 99831 30806871 gpbct31 99751 31423685 gpbct32 98372 30882552 gpbct33 101196 31703522 gpbct34 104259 31243013 gpbct35 97721 30073035 gpbct36 63331 19793323 gpbct37 103265 30594376 gpbct38 97671 31363746 gpbct39 94837 31094688 gpbct4 99983 30321805 gpbct40 93758 30268952 gpbct41 96313 30855188 gpbct42 96992 32228321 gpbct43 101519 31167371 gpbct44 103094 31859418 gpbct45 94141 30465516 gpbct46 99038 30607309 gpbct47 97950 30699828 gpbct48 67472 21341683 gpbct49 98189 31440428 gpbct5 60179 17546896 gpbct50 94043 30774999 gpbct51 95501 29599203 gpbct52 97329 31129691 gpbct53 
96204 31093389 gpbct54 103131 31958214 gpbct55 94986 30927741 gpbct56 99203 31262238 gpbct57 103202 32945104 gpbct58 103941 31783022 gpbct59 98200 31003347 gpbct6 64416 18307368 gpbct60 83811 25968375 gpbct61 100802 31954925 gpbct62 103012 31884778 gpbct63 104406 32941706 gpbct64 88166 28208530 gpbct65 50378 15949302 gpbct66 1483 459204 gpbct67 3369 1019669 gpbct68 5173 1447126 gpbct69 9384 2787376 gpbct7 85104 26812570 gpbct70 20856 5040988 gpbct71 39084 9715837 gpbct72 53824 13352297 gpbct73 53911 14600523 gpbct74 80457 24675535 gpbct75 84008 25620008 gpbct76 83264 26100070 gpbct77 93290 28033612 gpbct78 89748 27611415 gpbct79 87569 26051077 gpbct8 80414 24724612 gpbct80 21362 4673297 gpbct81 43483 11951761 gpbct82 66071 19449426 gpbct83 46150 11446092 gpbct84 50984 13654496 gpbct85 28469 8365284 gpbct9 86527 26542489 gpenv1 17040 3008660 gpenv10 6845 1227997 gpenv11 3652 775821 gpenv12 6643 1183438 gpenv13 3298 633709 gpenv14 5305 1034231 gpenv15 4915 863908 gpenv16 6995 1156621 gpenv17 14707 3617001 gpenv18 8060 1854548 gpenv19 0 0 gpenv2 16760 3481093 gpenv20 254 21896 gpenv21 9772 1774880 gpenv22 1271 231586 gpenv23 0 0 gpenv24 0 0 gpenv25 2797 613834 gpenv26 13086 2650077 gpenv27 3459 807424 gpenv28 1423 233055 gpenv29 9500 1831490 gpenv3 25248 6349315 gpenv30 2272 406614 gpenv31 0 0 gpenv32 3515 719382 gpenv33 1521 256485 gpenv34 2516 381044 gpenv35 384 72821 gpenv36 1421 228584 gpenv37 8999 1519836 gpenv38 7975 1405162 gpenv39 0 0 gpenv4 14972 3214852 gpenv40 2360 468448 gpenv41 0 0 gpenv42 994 179696 gpenv43 13791 2429776 gpenv44 0 0 gpenv45 8260 1555557 gpenv46 3769 910408 gpenv47 20956 1641881 gpenv48 0 0 gpenv49 0 0 gpenv5 11121 1925996 gpenv50 5489 1046794 gpenv51 2577 499903 gpenv52 2344 392483 gpenv53 2771 591507 gpenv6 6719 1351547 gpenv7 1454 261978 gpenv8 11818 2556283 gpenv9 9671 1882285 gpest1 0 0 gpest10 0 0 gpest100 0 0 gpest101 0 0 gpest102 0 0 gpest103 0 0 gpest104 0 0 gpest105 0 0 gpest106 0 0 gpest107 0 0 gpest108 0 0 gpest109 0 0 gpest11 
0 0 gpest110 0 0 gpest111 0 0 gpest112 0 0 gpest113 0 0 gpest114 0 0 gpest115 0 0 gpest116 0 0 gpest117 0 0 gpest118 0 0 gpest119 0 0 gpest12 0 0 gpest120 0 0 gpest121 0 0 gpest122 0 0 gpest123 0 0 gpest124 0 0 gpest125 0 0 gpest126 0 0 gpest127 0 0 gpest128 0 0 gpest129 0 0 gpest13 0 0 gpest130 0 0 gpest131 0 0 gpest132 0 0 gpest133 0 0 gpest134 0 0 gpest135 0 0 gpest136 0 0 gpest137 0 0 gpest138 0 0 gpest139 0 0 gpest14 0 0 gpest140 0 0 gpest141 0 0 gpest142 0 0 gpest143 0 0 gpest144 0 0 gpest145 0 0 gpest146 0 0 gpest147 0 0 gpest148 0 0 gpest149 0 0 gpest15 0 0 gpest150 0 0 gpest151 0 0 gpest152 0 0 gpest153 0 0 gpest154 0 0 gpest155 0 0 gpest156 0 0 gpest157 0 0 gpest158 0 0 gpest159 0 0 gpest16 0 0 gpest160 0 0 gpest161 0 0 gpest162 0 0 gpest163 0 0 gpest164 0 0 gpest165 0 0 gpest166 0 0 gpest167 0 0 gpest168 0 0 gpest169 0 0 gpest17 0 0 gpest170 0 0 gpest171 0 0 gpest172 0 0 gpest173 0 0 gpest174 0 0 gpest175 0 0 gpest176 0 0 gpest177 0 0 gpest178 0 0 gpest179 0 0 gpest18 0 0 gpest180 0 0 gpest181 0 0 gpest182 0 0 gpest183 0 0 gpest184 0 0 gpest185 0 0 gpest186 0 0 gpest187 0 0 gpest188 0 0 gpest189 0 0 gpest19 0 0 gpest190 0 0 gpest191 0 0 gpest192 0 0 gpest193 0 0 gpest194 0 0 gpest195 0 0 gpest196 0 0 gpest197 0 0 gpest198 0 0 gpest199 0 0 gpest2 0 0 gpest20 0 0 gpest200 0 0 gpest201 0 0 gpest202 0 0 gpest203 0 0 gpest204 0 0 gpest205 0 0 gpest206 0 0 gpest207 0 0 gpest208 0 0 gpest209 0 0 gpest21 0 0 gpest210 0 0 gpest211 0 0 gpest212 0 0 gpest213 0 0 gpest214 0 0 gpest215 0 0 gpest216 0 0 gpest217 0 0 gpest218 0 0 gpest219 0 0 gpest22 0 0 gpest220 0 0 gpest221 0 0 gpest222 0 0 gpest223 0 0 gpest224 0 0 gpest225 0 0 gpest226 0 0 gpest227 0 0 gpest228 0 0 gpest229 0 0 gpest23 0 0 gpest230 0 0 gpest231 0 0 gpest232 0 0 gpest233 0 0 gpest234 0 0 gpest235 0 0 gpest236 0 0 gpest237 0 0 gpest238 0 0 gpest239 0 0 gpest24 0 0 gpest240 0 0 gpest241 0 0 gpest242 0 0 gpest243 0 0 gpest244 0 0 gpest245 0 0 gpest246 0 0 gpest247 0 0 gpest248 0 0 gpest249 0 0 gpest25 
0 0 gpest250 0 0 gpest251 0 0 gpest252 0 0 gpest253 0 0 gpest254 0 0 gpest255 0 0 gpest256 0 0 gpest257 0 0 gpest258 0 0 gpest259 0 0 gpest26 0 0 gpest260 0 0 gpest261 0 0 gpest262 0 0 gpest263 0 0 gpest264 0 0 gpest265 0 0 gpest266 0 0 gpest267 0 0 gpest268 0 0 gpest269 0 0 gpest27 0 0 gpest270 0 0 gpest271 0 0 gpest272 0 0 gpest273 0 0 gpest274 0 0 gpest275 0 0 gpest276 0 0 gpest277 0 0 gpest278 0 0 gpest279 0 0 gpest28 0 0 gpest280 0 0 gpest281 0 0 gpest282 0 0 gpest283 0 0 gpest284 0 0 gpest285 0 0 gpest286 0 0 gpest287 0 0 gpest288 0 0 gpest289 0 0 gpest29 0 0 gpest290 0 0 gpest291 0 0 gpest292 0 0 gpest293 0 0 gpest294 0 0 gpest295 0 0 gpest296 0 0 gpest297 0 0 gpest298 0 0 gpest299 0 0 gpest3 0 0 gpest30 0 0 gpest300 0 0 gpest301 0 0 gpest302 0 0 gpest303 0 0 gpest304 0 0 gpest305 0 0 gpest306 0 0 gpest307 0 0 gpest308 0 0 gpest309 0 0 gpest31 0 0 gpest310 0 0 gpest311 0 0 gpest312 0 0 gpest313 0 0 gpest314 0 0 gpest315 0 0 gpest316 0 0 gpest317 0 0 gpest318 0 0 gpest319 0 0 gpest32 0 0 gpest320 0 0 gpest321 0 0 gpest322 0 0 gpest323 0 0 gpest324 0 0 gpest325 0 0 gpest326 0 0 gpest327 0 0 gpest328 0 0 gpest329 0 0 gpest33 0 0 gpest330 0 0 gpest331 0 0 gpest332 0 0 gpest333 0 0 gpest334 0 0 gpest335 0 0 gpest336 0 0 gpest337 0 0 gpest338 0 0 gpest339 0 0 gpest34 0 0 gpest340 0 0 gpest341 0 0 gpest342 0 0 gpest343 0 0 gpest344 0 0 gpest345 0 0 gpest346 0 0 gpest347 0 0 gpest348 0 0 gpest349 0 0 gpest35 0 0 gpest350 0 0 gpest351 0 0 gpest352 0 0 gpest353 0 0 gpest354 0 0 gpest355 0 0 gpest356 0 0 gpest357 0 0 gpest358 0 0 gpest359 0 0 gpest36 0 0 gpest360 0 0 gpest361 0 0 gpest362 0 0 gpest363 0 0 gpest364 0 0 gpest365 0 0 gpest366 0 0 gpest367 0 0 gpest368 0 0 gpest369 0 0 gpest37 0 0 gpest370 0 0 gpest371 0 0 gpest372 0 0 gpest373 0 0 gpest374 0 0 gpest375 0 0 gpest376 0 0 gpest377 0 0 gpest378 0 0 gpest379 0 0 gpest38 0 0 gpest380 0 0 gpest381 0 0 gpest382 0 0 gpest383 0 0 gpest384 0 0 gpest385 0 0 gpest386 0 0 gpest387 0 0 gpest388 0 0 gpest389 0 0 gpest39 
0 0 gpest390 0 0 gpest391 0 0 gpest392 0 0 gpest393 0 0 gpest394 0 0 gpest395 0 0 gpest396 0 0 gpest397 0 0 gpest398 0 0 gpest399 0 0 gpest4 0 0 gpest40 0 0 gpest400 0 0 gpest401 0 0 gpest402 0 0 gpest403 0 0 gpest404 0 0 gpest405 0 0 gpest406 0 0 gpest407 0 0 gpest408 0 0 gpest409 0 0 gpest41 0 0 gpest410 0 0 gpest411 0 0 gpest412 0 0 gpest413 0 0 gpest414 0 0 gpest415 0 0 gpest416 0 0 gpest417 0 0 gpest418 0 0 gpest419 0 0 gpest42 0 0 gpest420 0 0 gpest421 0 0 gpest422 0 0 gpest423 0 0 gpest424 0 0 gpest425 0 0 gpest426 0 0 gpest427 0 0 gpest428 0 0 gpest429 0 0 gpest43 0 0 gpest430 0 0 gpest431 0 0 gpest432 0 0 gpest433 0 0 gpest434 0 0 gpest435 0 0 gpest436 0 0 gpest437 0 0 gpest438 0 0 gpest439 0 0 gpest44 0 0 gpest440 0 0 gpest441 0 0 gpest442 0 0 gpest443 0 0 gpest444 0 0 gpest445 0 0 gpest446 0 0 gpest447 0 0 gpest448 0 0 gpest449 0 0 gpest45 0 0 gpest450 0 0 gpest451 0 0 gpest452 0 0 gpest453 0 0 gpest454 0 0 gpest455 0 0 gpest456 0 0 gpest457 0 0 gpest458 0 0 gpest459 0 0 gpest46 0 0 gpest460 0 0 gpest461 0 0 gpest47 0 0 gpest48 0 0 gpest49 0 0 gpest5 0 0 gpest50 0 0 gpest51 0 0 gpest52 0 0 gpest53 0 0 gpest54 0 0 gpest55 0 0 gpest56 0 0 gpest57 0 0 gpest58 0 0 gpest59 0 0 gpest6 0 0 gpest60 0 0 gpest61 0 0 gpest62 0 0 gpest63 0 0 gpest64 0 0 gpest65 0 0 gpest66 0 0 gpest67 0 0 gpest68 0 0 gpest69 0 0 gpest7 0 0 gpest70 0 0 gpest71 0 0 gpest72 0 0 gpest73 0 0 gpest74 0 0 gpest75 0 0 gpest76 0 0 gpest77 0 0 gpest78 0 0 gpest79 0 0 gpest8 0 0 gpest80 0 0 gpest81 0 0 gpest82 0 0 gpest83 0 0 gpest84 0 0 gpest85 0 0 gpest86 0 0 gpest87 0 0 gpest88 0 0 gpest89 0 0 gpest9 0 0 gpest90 0 0 gpest91 0 0 gpest92 0 0 gpest93 0 0 gpest94 0 0 gpest95 0 0 gpest96 0 0 gpest97 0 0 gpest98 0 0 gpest99 0 0 gpgss1 0 0 gpgss10 0 0 gpgss100 0 0 gpgss101 0 0 gpgss102 0 0 gpgss103 0 0 gpgss104 0 0 gpgss105 0 0 gpgss106 0 0 gpgss107 0 0 gpgss108 0 0 gpgss109 0 0 gpgss11 0 0 gpgss110 0 0 gpgss111 0 0 gpgss112 0 0 gpgss113 0 0 gpgss114 0 0 gpgss115 0 0 gpgss116 0 0 gpgss117 0 0 
gpgss118 0 0 gpgss119 0 0 gpgss12 0 0 gpgss120 0 0 gpgss121 0 0 gpgss122 0 0 gpgss123 0 0 gpgss124 0 0 gpgss125 0 0 gpgss126 0 0 gpgss127 0 0 gpgss128 0 0 gpgss129 0 0 gpgss13 0 0 gpgss130 0 0 gpgss131 0 0 gpgss132 0 0 gpgss133 0 0 gpgss134 0 0 gpgss135 0 0 gpgss136 0 0 gpgss137 0 0 gpgss138 0 0 gpgss139 0 0 gpgss14 0 0 gpgss140 0 0 gpgss141 0 0 gpgss142 0 0 gpgss143 0 0 gpgss144 0 0 gpgss145 0 0 gpgss146 0 0 gpgss147 0 0 gpgss148 0 0 gpgss149 0 0 gpgss15 0 0 gpgss150 0 0 gpgss151 0 0 gpgss152 0 0 gpgss153 0 0 gpgss154 0 0 gpgss155 0 0 gpgss156 0 0 gpgss157 0 0 gpgss158 0 0 gpgss159 0 0 gpgss16 0 0 gpgss160 0 0 gpgss161 0 0 gpgss162 0 0 gpgss163 0 0 gpgss164 0 0 gpgss165 0 0 gpgss166 0 0 gpgss167 0 0 gpgss168 0 0 gpgss169 0 0 gpgss17 0 0 gpgss170 0 0 gpgss171 0 0 gpgss172 0 0 gpgss173 0 0 gpgss174 0 0 gpgss175 0 0 gpgss176 61 12916 gpgss177 0 0 gpgss178 0 0 gpgss179 0 0 gpgss18 0 0 gpgss180 0 0 gpgss181 0 0 gpgss182 0 0 gpgss183 0 0 gpgss184 0 0 gpgss185 0 0 gpgss186 0 0 gpgss187 0 0 gpgss188 0 0 gpgss189 0 0 gpgss19 0 0 gpgss190 0 0 gpgss191 0 0 gpgss192 0 0 gpgss193 0 0 gpgss194 0 0 gpgss195 0 0 gpgss196 0 0 gpgss197 0 0 gpgss198 0 0 gpgss199 0 0 gpgss2 0 0 gpgss20 0 0 gpgss200 0 0 gpgss201 0 0 gpgss202 0 0 gpgss203 0 0 gpgss204 0 0 gpgss205 0 0 gpgss206 0 0 gpgss207 0 0 gpgss208 0 0 gpgss209 0 0 gpgss21 0 0 gpgss210 0 0 gpgss211 0 0 gpgss212 0 0 gpgss213 0 0 gpgss214 0 0 gpgss215 0 0 gpgss216 0 0 gpgss217 0 0 gpgss218 0 0 gpgss219 0 0 gpgss22 0 0 gpgss220 0 0 gpgss221 0 0 gpgss222 0 0 gpgss223 0 0 gpgss224 0 0 gpgss225 0 0 gpgss226 0 0 gpgss227 0 0 gpgss228 0 0 gpgss229 0 0 gpgss23 0 0 gpgss230 0 0 gpgss231 0 0 gpgss232 0 0 gpgss233 0 0 gpgss234 0 0 gpgss235 0 0 gpgss236 0 0 gpgss237 0 0 gpgss238 0 0 gpgss239 0 0 gpgss24 0 0 gpgss240 0 0 gpgss241 0 0 gpgss242 0 0 gpgss243 0 0 gpgss244 0 0 gpgss245 0 0 gpgss246 0 0 gpgss247 0 0 gpgss248 0 0 gpgss249 0 0 gpgss25 0 0 gpgss250 0 0 gpgss251 0 0 gpgss252 0 0 gpgss253 0 0 gpgss254 0 0 gpgss255 0 0 gpgss26 0 0 gpgss27 0 
0 gpgss28 0 0 gpgss29 0 0 gpgss3 0 0 gpgss30 0 0 gpgss31 0 0 gpgss32 0 0 gpgss33 0 0 gpgss34 0 0 gpgss35 0 0 gpgss36 0 0 gpgss37 0 0 gpgss38 0 0 gpgss39 0 0 gpgss4 0 0 gpgss40 0 0 gpgss41 0 0 gpgss42 0 0 gpgss43 0 0 gpgss44 0 0 gpgss45 0 0 gpgss46 0 0 gpgss47 0 0 gpgss48 0 0 gpgss49 0 0 gpgss5 0 0 gpgss50 0 0 gpgss51 0 0 gpgss52 0 0 gpgss53 0 0 gpgss54 0 0 gpgss55 0 0 gpgss56 0 0 gpgss57 0 0 gpgss58 0 0 gpgss59 0 0 gpgss6 0 0 gpgss60 0 0 gpgss61 0 0 gpgss62 0 0 gpgss63 0 0 gpgss64 0 0 gpgss65 0 0 gpgss66 0 0 gpgss67 0 0 gpgss68 0 0 gpgss69 0 0 gpgss7 0 0 gpgss70 0 0 gpgss71 0 0 gpgss72 0 0 gpgss73 0 0 gpgss74 0 0 gpgss75 0 0 gpgss76 0 0 gpgss77 0 0 gpgss78 0 0 gpgss79 0 0 gpgss8 0 0 gpgss80 0 0 gpgss81 0 0 gpgss82 0 0 gpgss83 0 0 gpgss84 0 0 gpgss85 0 0 gpgss86 0 0 gpgss87 0 0 gpgss88 0 0 gpgss89 0 0 gpgss9 0 0 gpgss90 0 0 gpgss91 0 0 gpgss92 0 0 gpgss93 0 0 gpgss94 0 0 gpgss95 0 0 gpgss96 0 0 gpgss97 0 0 gpgss98 0 0 gpgss99 0 0 gphtc1 9654 2386776 gphtc10 15558 3600866 gphtc11 713 352883 gphtc12 0 0 gphtc13 105 56038 gphtc14 11168 3445344 gphtc15 14180 2562604 gphtc2 6112 2412864 gphtc3 5947 2345170 gphtc4 5750 2203727 gphtc5 7594 3385832 gphtc6 9233 3766862 gphtc7 3999 1626363 gphtc8 37 13171 gphtc9 3516 1381819 gphtg1 1 712 gphtg10 0 0 gphtg100 0 0 gphtg101 0 0 gphtg102 0 0 gphtg103 0 0 gphtg104 0 0 gphtg105 0 0 gphtg106 0 0 gphtg107 0 0 gphtg108 0 0 gphtg109 0 0 gphtg11 0 0 gphtg110 0 0 gphtg111 0 0 gphtg112 94 52971 gphtg113 5220 1344361 gphtg114 0 0 gphtg115 11522 3150969 gphtg116 2083 538683 gphtg117 12464 5828662 gphtg118 16 7139 gphtg119 0 0 gphtg12 0 0 gphtg120 0 0 gphtg121 4268 1240469 gphtg122 0 0 gphtg123 0 0 gphtg124 0 0 gphtg125 0 0 gphtg126 0 0 gphtg127 0 0 gphtg128 0 0 gphtg129 0 0 gphtg13 0 0 gphtg130 333 109804 gphtg131 119 52330 gphtg132 0 0 gphtg133 0 0 gphtg134 1 438 gphtg135 3642 1061325 gphtg136 6607 1841375 gphtg14 0 0 gphtg15 0 0 gphtg16 7 3848 gphtg17 0 0 gphtg18 0 0 gphtg19 0 0 gphtg2 0 0 gphtg20 26 14030 gphtg21 0 0 gphtg22 0 0 gphtg23 
0 0 gphtg24 0 0 gphtg25 0 0 gphtg26 0 0 gphtg27 23 8608 gphtg28 1 234 gphtg29 19 15452 gphtg3 0 0 gphtg30 0 0 gphtg31 0 0 gphtg32 0 0 gphtg33 23 11441 gphtg34 0 0 gphtg35 0 0 gphtg36 0 0 gphtg37 0 0 gphtg38 0 0 gphtg39 0 0 gphtg4 0 0 gphtg40 0 0 gphtg41 0 0 gphtg42 0 0 gphtg43 0 0 gphtg44 0 0 gphtg45 0 0 gphtg46 0 0 gphtg47 33 14631 gphtg48 1 548 gphtg49 0 0 gphtg5 0 0 gphtg50 0 0 gphtg51 0 0 gphtg52 0 0 gphtg53 0 0 gphtg54 0 0 gphtg55 1 243 gphtg56 0 0 gphtg57 0 0 gphtg58 0 0 gphtg59 0 0 gphtg6 0 0 gphtg60 0 0 gphtg61 0 0 gphtg62 0 0 gphtg63 143 30463 gphtg64 0 0 gphtg65 0 0 gphtg66 0 0 gphtg67 0 0 gphtg68 0 0 gphtg69 0 0 gphtg7 0 0 gphtg70 0 0 gphtg71 0 0 gphtg72 0 0 gphtg73 316 99716 gphtg74 0 0 gphtg75 59 22011 gphtg76 0 0 gphtg77 0 0 gphtg78 71 22015 gphtg79 0 0 gphtg8 0 0 gphtg80 0 0 gphtg81 0 0 gphtg82 0 0 gphtg83 0 0 gphtg84 0 0 gphtg85 0 0 gphtg86 0 0 gphtg87 0 0 gphtg88 0 0 gphtg89 0 0 gphtg9 0 0 gphtg90 0 0 gphtg91 0 0 gphtg92 0 0 gphtg93 0 0 gphtg94 0 0 gphtg95 0 0 gphtg96 0 0 gphtg97 0 0 gphtg98 0 0 gphtg99 0 0 gpinv1 29908 8814235 gpinv10 44335 13438777 gpinv11 50656 11403387 gpinv12 32722 7646332 gpinv13 51126 11158855 gpinv14 52294 12022693 gpinv15 56050 13274485 gpinv16 32053 7313145 gpinv17 39796 13650221 gpinv18 39301 23846293 gpinv19 44932 15684756 gpinv2 15324 5650116 gpinv20 67313 14309195 gpinv21 56769 12450921 gpinv22 30044 10589241 gpinv23 4174 2028911 gpinv24 43922 9902365 gpinv25 60321 12712288 gpinv26 19075 4568370 gpinv27 60954 12953662 gpinv28 58965 12465568 gpinv29 65825 14372571 gpinv3 3813 2079959 gpinv30 29224 10915650 gpinv4 28632 17768457 gpinv5 47502 12654713 gpinv6 26496 8728370 gpinv7 43800 14217240 gpinv8 2927 675007 gpinv9 48728 11513455 gpmam1 7994 2135844 gpmam2 9826 2279581 gpmam3 38891 10571893 gpmam4 3981 979028 gpmam5 37208 9230731 gpmam6 19624 4375282 gpmam7 49418 12156309 gpmam8 468 130578 gppat1 5117 1502025 gppat10 3915 1361347 gppat100 0 0 gppat101 0 0 gppat102 0 0 gppat103 0 0 gppat104 0 0 gppat105 0 0 gppat106 0 
0 gppat107 2317 629273 gppat108 258 106444 gppat109 0 0 gppat11 3063 1205921 gppat110 1272 526233 gppat111 418 153327 gppat112 6421 2484390 gppat113 15834 6188151 gppat114 0 0 gppat115 0 0 gppat116 0 0 gppat117 0 0 gppat118 0 0 gppat119 0 0 gppat12 11551 2100922 gppat120 0 0 gppat121 0 0 gppat122 0 0 gppat123 0 0 gppat124 0 0 gppat125 0 0 gppat126 0 0 gppat127 0 0 gppat128 0 0 gppat129 0 0 gppat13 0 0 gppat130 0 0 gppat131 0 0 gppat132 0 0 gppat133 0 0 gppat134 0 0 gppat135 0 0 gppat136 0 0 gppat137 0 0 gppat138 0 0 gppat139 0 0 gppat14 0 0 gppat140 0 0 gppat141 0 0 gppat142 0 0 gppat143 0 0 gppat144 68 29006 gppat145 0 0 gppat146 0 0 gppat147 2602 895934 gppat148 699 339462 gppat149 0 0 gppat15 0 0 gppat150 57527 26705485 gppat151 61953 28458415 gppat152 19796 7623062 gppat153 907 421386 gppat154 11219 4270766 gppat155 46872 20113407 gppat156 19168 7377825 gppat157 0 0 gppat158 0 0 gppat159 0 0 gppat16 0 0 gppat160 0 0 gppat161 0 0 gppat162 3356 1267088 gppat163 3445 1127323 gppat164 0 0 gppat165 183 75356 gppat166 133 16947 gppat167 1156 469337 gppat168 1050 448888 gppat169 0 0 gppat17 0 0 gppat170 0 0 gppat171 0 0 gppat172 0 0 gppat173 0 0 gppat174 0 0 gppat175 112 102735 gppat176 1726 729424 gppat177 491 214558 gppat178 449 199471 gppat18 0 0 gppat19 0 0 gppat2 0 0 gppat20 0 0 gppat21 2751 1066337 gppat22 2071 732601 gppat23 3974 1283047 gppat24 5909 1976221 gppat25 3781 1503628 gppat26 192 63977 gppat27 0 0 gppat28 0 0 gppat29 0 0 gppat3 0 0 gppat30 0 0 gppat31 0 0 gppat32 0 0 gppat33 0 0 gppat34 0 0 gppat35 0 0 gppat36 0 0 gppat37 0 0 gppat38 0 0 gppat39 0 0 gppat4 0 0 gppat40 0 0 gppat41 0 0 gppat42 0 0 gppat43 0 0 gppat44 0 0 gppat45 0 0 gppat46 0 0 gppat47 0 0 gppat48 0 0 gppat49 0 0 gppat5 0 0 gppat50 0 0 gppat51 0 0 gppat52 0 0 gppat53 0 0 gppat54 0 0 gppat55 0 0 gppat56 2655 873899 gppat57 1049 517498 gppat58 51259 23192000 gppat59 17404 8883168 gppat6 0 0 gppat60 0 0 gppat61 0 0 gppat62 0 0 gppat63 0 0 gppat64 0 0 gppat65 0 0 gppat66 0 0 gppat67 0 0 
gppat68 0 0 gppat69 0 0 gppat7 7025 2311444 gppat70 0 0 gppat71 0 0 gppat72 0 0 gppat73 0 0 gppat74 0 0 gppat75 0 0 gppat76 0 0 gppat77 0 0 gppat78 0 0 gppat79 0 0 gppat8 3107 1159563 gppat80 0 0 gppat81 0 0 gppat82 0 0 gppat83 0 0 gppat84 0 0 gppat85 0 0 gppat86 0 0 gppat87 0 0 gppat88 0 0 gppat89 0 0 gppat9 3907 1645335 gppat90 0 0 gppat91 0 0 gppat92 0 0 gppat93 0 0 gppat94 0 0 gppat95 0 0 gppat96 0 0 gppat97 0 0 gppat98 0 0 gppat99 0 0 gpphg1 110325 22804792 gppln1 34593 11827215 gppln10 28052 8825183 gppln11 15699 4901537 gppln12 24678 8486305 gppln13 32777 13917646 gppln14 23828 9519302 gppln15 7246 3505419 gppln16 6091 2897504 gppln17 6114 2925188 gppln18 17174 6243289 gppln19 30192 10067914 gppln2 33385 11465228 gppln20 20679 5546716 gppln21 20165 7818291 gppln22 4725 1687292 gppln23 36384 10600258 gppln24 11548 3524164 gppln25 31291 8203393 gppln26 45132 13726895 gppln27 44453 15556038 gppln28 32307 14230769 gppln29 27835 12044049 gppln3 14057 6235217 gppln30 40945 19685492 gppln31 13534 6529146 gppln32 35395 11591443 gppln33 33141 8529721 gppln34 26757 6333104 gppln35 34030 7761784 gppln36 33162 8026955 gppln37 34886 7643946 gppln38 38958 8532231 gppln39 14450 2219489 gppln4 5661 1888764 gppln40 34871 4919830 gppln41 37331 7294362 gppln42 33056 11570212 gppln43 33846 15563815 gppln44 39427 15540564 gppln45 22352 4325037 gppln46 28921 5929047 gppln47 41374 11734573 gppln48 51891 18272338 gppln49 37369 7910604 gppln5 210 73365 gppln50 26721 5582775 gppln51 32012 7094691 gppln52 37033 7722946 gppln53 38922 9037140 gppln54 43906 13844096 gppln55 9662 4070200 gppln6 0 0 gppln7 17554 7975153 gppln8 30383 9083633 gppln9 4126 1205394 gppri1 17561 6200880 gppri10 39 12519 gppri11 121 37760 gppri12 161 47964 gppri13 129 49780 gppri14 88 27447 gppri15 33 9722 gppri16 9 1906 gppri17 0 0 gppri18 0 0 gppri19 0 0 gppri2 9998 2886729 gppri20 0 0 gppri21 0 0 gppri22 8891 2573557 gppri23 2663 989201 gppri24 19853 5790988 gppri25 20246 5624789 gppri26 19435 8479154 gppri27 
9672 4473662 gppri28 3164 1703974 gppri29 2535 1445612 gppri3 550 209050 gppri30 2678 1539472 gppri31 2728 1599926 gppri32 12152 2653561 gppri33 52 27372 gppri34 37839 10500512 gppri35 13495 2822047 gppri36 23339 9120225 gppri37 15040 7559063 gppri38 21262 5921747 gppri39 38709 8954921 gppri4 197 81442 gppri40 51252 13493588 gppri41 26549 6678232 gppri42 55406 14282653 gppri43 78741 22483316 gppri44 21070 6326505 gppri45 66 35055 gppri5 248 102405 gppri6 212 77202 gppri7 25 11228 gppri8 109 41693 gppri9 283 92663 gprod1 12064 3545133 gprod10 0 0 gprod11 0 0 gprod12 0 0 gprod13 0 0 gprod14 0 0 gprod15 0 0 gprod16 0 0 gprod17 0 0 gprod18 0 0 gprod19 11984 4134908 gprod2 0 0 gprod20 16571 6859553 gprod21 3625 1895300 gprod22 2971 1525700 gprod23 11354 4138460 gprod24 30853 12041303 gprod25 19324 8669116 gprod26 1839 885849 gprod27 7741 2190630 gprod28 27694 6516623 gprod29 38461 10354802 gprod3 4 2148 gprod4 1 38 gprod5 1 120 gprod6 0 0 gprod7 0 0 gprod8 0 0 gprod9 0 0 gpsts1 1 66 gpsts10 0 0 gpsts11 0 0 gpsts12 0 0 gpsts13 0 0 gpsts14 0 0 gpsts15 0 0 gpsts16 0 0 gpsts17 4 432 gpsts18 0 0 gpsts19 0 0 gpsts2 0 0 gpsts20 4 314 gpsts3 0 0 gpsts4 0 0 gpsts5 0 0 gpsts6 0 0 gpsts7 0 0 gpsts8 0 0 gpsts9 0 0 gpsyn1 39399 15982578 gpsyn2 27662 10541044 gpsyn3 11344 3406306 gpsyn4 0 0 gpsyn5 0 0 gpsyn6 0 0 gpsyn7 2444 682113 gptsa1 1038 194157 gptsa10 26 6242 gptsa11 1425 398554 gptsa12 0 0 gptsa13 0 0 gptsa14 0 0 gptsa15 294 55067 gptsa16 0 0 gptsa17 0 0 gptsa18 1 539 gptsa19 66 10060 gptsa2 166 28119 gptsa20 3 662 gptsa21 0 0 gptsa22 93 15460 gptsa23 0 0 gptsa24 0 0 gptsa25 10075 5045981 gptsa26 140 9714 gptsa27 0 0 gptsa28 9 2903 gptsa29 213 38094 gptsa3 0 0 gptsa30 0 0 gptsa31 302 102145 gptsa32 0 0 gptsa33 0 0 gptsa34 0 0 gptsa35 0 0 gptsa36 0 0 gptsa37 4 622 gptsa38 1 327 gptsa39 0 0 gptsa4 3284 894798 gptsa40 0 0 gptsa41 0 0 gptsa42 0 0 gptsa43 152 22361 gptsa44 25 3514 gptsa45 0 0 gptsa46 0 0 gptsa47 0 0 gptsa48 0 0 gptsa49 960 269526 gptsa5 0 0 gptsa50 0 0 gptsa51 0 0 
gptsa52 4849 1394562 gptsa53 0 0 gptsa54 0 0 gptsa55 19062 5233659 gptsa56 2216 356560 gptsa57 0 0 gptsa58 0 0 gptsa59 93 14399 gptsa6 12 2307 gptsa60 30 8247 gptsa61 0 0 gptsa62 0 0 gptsa63 0 0 gptsa64 0 0 gptsa65 0 0 gptsa66 0 0 gptsa67 0 0 gptsa68 0 0 gptsa69 18545 7976435 gptsa7 0 0 gptsa70 6938 3890179 gptsa8 0 0 gptsa9 60 14088 gpuna1 52 7598 gpvrl1 80410 21182488 gpvrl10 72329 21722485 gpvrl11 50567 15960076 gpvrl12 71649 22309885 gpvrl13 70543 23688242 gpvrl14 72357 20571490 gpvrl15 75149 23747248 gpvrl16 73194 22893388 gpvrl17 75770 22644751 gpvrl18 70226 22323413 gpvrl19 74121 23304627 gpvrl2 85147 19767229 gpvrl20 44859 11157362 gpvrl3 76785 18600994 gpvrl4 79481 20422313 gpvrl5 55061 15809013 gpvrl6 63392 26333991 gpvrl7 60930 24301525 gpvrl8 77440 22688953 gpvrl9 79782 20356447 gpvrt1 25714 7560681 gpvrt10 1826 927136 gpvrt11 1975 944324 gpvrt12 916 427093 gpvrt13 2463 945725 gpvrt14 2551 1086818 gpvrt15 1382 602455 gpvrt16 20694 4807945 gpvrt17 49377 11424645 gpvrt18 52637 11970392 gpvrt19 21006 4455096 gpvrt2 464 73649 gpvrt20 45624 10830101 gpvrt21 22581 5140567 gpvrt22 52545 11681397 gpvrt23 53139 12061024 gpvrt24 53421 11425815 gpvrt25 18179 3750296 gpvrt26 17586 4615938 gpvrt3 44055 10910081 gpvrt4 13222 3992610 gpvrt5 21944 5095792 gpvrt6 46029 12177466 gpvrt7 29124 11786803 gpvrt8 27325 8491989 gpvrt9 1699 797286

3. FILE FORMAT

3.1 File Header Information

All of the files in the distribution begin with the same header, except for three lines: the first line, which contains the division name; the fifth line, which contains a description of the file contents; and the seventh line, which contains the number of loci and residues in the file.

The first line of the file contains the division name in character positions 1 to 9 and the full data bank name (Genetic Sequence Data Bank) starting in column 16. The brief names of the files in this release are listed in section 2.1.

The second line contains the date of the current release in the form month-day-year, "MM-DD-YYYY".

The fourth line contains the current GenPept release number. The release number consists of two numbers separated by a decimal point. The number to the left of the decimal is the major release number. The digit to the right of the decimal indicates the version of the major release; it is zero for the first version.

The fifth line contains a title for the file.

The seventh line lists the number of entries (loci) and the number of residues in the file.

1-------10--------20--------30--------40--------50--------60--------70------78
gpbct1         Genetic Sequence Data Bank
04-22-2012

GenPept Release 189.0
Translated Protein-coding Sequences

54877 loci containing 15749170 residues
1-------10--------20--------30--------40--------50--------60--------70------78

Example 1. Sample File Header

3.2 Sequence Entry Files

GenPept entries are derived from entries in the GenBank nucleotide sequence data bank. They contain minimal annotation, primarily extracted from the corresponding GenBank entries. For the complete annotations, refer to the GenBank entry or entries referenced by the accession number(s) in the GenPept entry.

3.2.1 Entry Organization

Each record (one record = one line) consists of two parts. The first part is found in positions 1 to 10 and may contain:

1. A keyword, beginning in column 1 of the record (e.g., DEFINITION is a keyword).
2. Blank characters, indicating that this record is a continuation of the information under the keyword above it.
3. A number, ending in column 9 of the record. This number occurs in the portion of the entry containing the actual amino acid sequence and designates the numbering of sequence positions.
4. Two slashes (//) in positions 1 and 2, marking the end of an entry.

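The four cases in the list above can be told apart by looking only at the first ten columns of a record. The following Python sketch illustrates one way to do that. The function name `classify_record` and the returned labels are hypothetical and are not part of any GenPept software; records that match none of the four cases are reported as "other", which is an assumption of this sketch rather than a rule stated in this document.

```python
def classify_record(line: str) -> str:
    """Classify one GenPept record using columns 1-10 (hypothetical helper)."""
    # Case 4: two slashes in positions 1 and 2 mark the end of an entry.
    if line.startswith("//"):
        return "end"

    first_part = line[:10]

    # Case 2: a blank first part means the record continues the keyword above.
    if not first_part.strip():
        # A completely empty line carries no information (sketch assumption).
        return "continuation" if line[10:].strip() else "other"

    # Case 1: a keyword begins in column 1.
    if not first_part[0].isspace():
        return "keyword"

    # Case 3: a right-justified residue number in the sequence portion.
    if first_part.strip().isdigit():
        return "sequence"

    # Anything else, for example an indented subkeyword if a file uses one.
    return "other"
```

A reader could apply this to each line of an entry to decide whether to start a new field, extend the previous one, collect sequence residues, or close the entry.
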
The second part of each sequence entry record (line) contains the information appropriate to its keyword, in positions 13 to 80 for keywords and positions 11 to 80 for the sequence. The following is a brief description of each entry field. LOCUS The entry name. It consists the accession number of the GenBank nucleotide sequence entry (or entries) from which this product was translated, followed by an underscore character ( _ ) and a number indicating which coding region (CDS) in the feature table of the original GenBank entry was used for this translation. The number is determined by assigning a number to each CDS according to its order of appearance in the original GenBank entry's feature table. Detailed format for the LOCUS line: Positions Contents --------- -------- 01-05 'LOCUS' 06-12 spaces 13-25 GenPept Locus name 26-26 space 27-42 GenBank Locus name 43-43 space 44-49 Length of peptide sequence 50-50 space 51-52 'aa' 53-53 space 54-56 'PEP' 57-57 space 58-63 'linear' 64-64 space 65-67 GenBank division code 68-68 space 69-79 Date, in format dd-mmm-yyyy DEFINITION This field contains the feature as it appeared in the original GenBank entry that was translated to produce the sequence in this GenPept entry. If the GenBank CDS had a "/note" qualifier, the text of this qualifier is placed on a continuation line as part of the GenPept DEFINITION record. DATE Entry date for the GenBank locus used to create this record. ACCESSION Primary accession numbers of all the GenBank entries from which this GenPept entry was created. VERSION A compound identifier consisting of the GenPept Locus and a numeric version number associated with the current version of the sequence data in the record. This is followed by an integer key (a "GI") assigned to the peptide sequence. Mandatory keyword/exactly one record. KEYWORDS Short phrases describing gene products and other information, taken directly from the corresponding GenBank entry. 
Mandatory keyword in all annotated entries/one or more records. SOURCE Common name of the organism or the name most frequently used in the literature. Mandatory keyword in all annotated entries/one or more records/includes one subkeyword. ORGANISM Source organism of the nucleic acid sequence COMMENT This field identifies the coding regions translated to make this protein. It reproduces the relevant lines from the Feature tables of the GenBank data bank entries. WEIGHT Protein molecular weight calculated from the sequence. PI Isoelectric point. Mandatory keyword/exactly one record. LENGTH Protein length in amino acid residues. ORIGIN Indication of codon phase used in translation The sequence immediately follows the ORIGIN line. It uses the IUPAC-IUB one-letter amino acid codes (see Appendix A). The first 9 columns in each line are reserved for a right-justified integer representing the residue number of the first amino acid on the line. Column 10 is blank and the sequence begins in column 11. The sequence is presented with up to 60 residues per line, in groups of 10 residues separated by spaces. Note that "?"s in GenBank entries' /translation qualifier sequences are converted to "X"s in GenPept. Residues are in uppercase. // A double slash marks the end of each entry. The next entry begins on the following line. 3.2.2 Sample Sequence Data File An example of a complete sequence entry follows. 1-------10--------20--------30--------40--------50--------60--------70------78 LOCUS AB000100_1 AB000100 263 aa PEP linear BCT 15-MAY-2009 DEFINITION Synechococcus elongatus PCC 7942 genes for intrinsic membrane protein, malK-like protein, cyanase, complete cds. DATE 15-MAY-2009 ACCESSION AB000100 VERSION AB000100_1.1 GI:2330515 KEYWORDS . SOURCE Synechococcus elongatus PCC 7942 ORGANISM Synechococcus elongatus PCC 7942 Bacteria; Cyanobacteria; Chroococcales; Synechococcus. 
COMMENT     CDS             121..912
                            /gene="cynB"
                            /transl_table=11
                            /product="intrinsic membrane protein"
                            /protein_id="BAA21794.1"
                            /db_xref="GI:2330515"
                            /NucGI="2330514"
WEIGHT      28647.67
PI          9.76
LENGTH      263
ORIGIN      Translated using phase 1
        1 MVRTPVPLYL RWAVSILSVL AFLAIWQIAA ASGFLGKTFP GSLRTLQDLF GWLSDPFFDN
       61 GPNDLGIGWN LLISLRRVAI GYLLATVVAI PLGIAIGMSA LASSIFSPFV QLLKPVSPLA
      121 WLPIGLFLFR DSELTGVFVI LISSLWPTLI NTAFGVANVN PDFLKVSQSL GASRWRTILK
      181 VILPAALPSI IAGMRISMGI AWLVIVAAEM LLGTGIGYFI WNEWNNLSLP NIFSAIIIIG
      241 IVGILLDQGF RFLENQFSYA GNR
//
1-------10--------20--------30--------40--------50--------60--------70------78

Example 2. Sample Sequence Data File


4 Trademarks, citations, etc.

4.1 Registered Trademark Notices

GenBank (R) is a registered trademark of the U.S. Department of Health
and Human Services for the Genetic Sequence Data Bank.

GenPept (R) is a registered trademark of the U.S. Department of Health
and Human Services for the GenBank Gene Products Data Bank.

4.2 Citing GenPept

If you have used GenPept in your research, please include a reference to
the database in all publications related to that research. For instance:

   1. GenPept (GenBank Gene Products) Database. Distributed on the
      Internet via anonymous FTP from ftp.ncifcrf.gov, under the
      auspices of the National Cancer Institute's Advanced Biomedical
      Computing Center.

When citing data in GenPept, it is appropriate to give the sequence
name, release number, and the publication in which the parent GenBank
sequence first appeared.

It is also appropriate to list a reference for GenBank itself, since
GenPept is derived from the GenBank data. The following publication,
which describes the GenBank data bank, should be cited:

   Burks, C., Cassidy, M., Cinkosky, M.J., Cumella, K.E., Gilna, P.,
   Hayden, J.E-D., Keen, G.M., Kelley, T.A., Kelly, M., Kristofferson,
   D., and Ryals, J. GenBank. Nucl. Acids Res.
   19 (Suppl):2221-2225 (1991)

4.3 GenPept Distribution Format

The GenPept data bank is available by anonymous FTP from
ftp.ncifcrf.gov.

4.4 Disclaimer

Science Applications International Corp. and the United States
Government make no representations or warranties regarding the content
or accuracy of this information. Science Applications International
Corp. and the United States Government also make no representations or
warranties of merchantability or fitness for a particular purpose and
accept no responsibility for any consequences of the receipt or use of
the information.


APPENDIX A - IUPAC-IUB AMINO ACID CODES

Code    Amino Acid
----    ----------
A       Alanine (ala)
R       Arginine (arg)
N       Asparagine (asn)
D       Aspartic acid (asp)
C       Cysteine (cys)
Q       Glutamine (gln)
E       Glutamic acid (glu)
G       Glycine (gly)
H       Histidine (his)
I       Isoleucine (ile)
L       Leucine (leu)
K       Lysine (lys)
M       Methionine (met)
F       Phenylalanine (phe)
P       Proline (pro)
S       Serine (ser)
T       Threonine (thr)
U       Selenocysteine
W       Tryptophan (trp)
Y       Tyrosine (tyr)
V       Valine (val)
B       Aspartic acid or Asparagine (asx)
Z       Glutamic acid or Glutamine (glx)
X       Any amino acid (xxx)
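
The sequence line layout described in section 3.2.1 and the code table
in this appendix can be checked mechanically. The Python sketch below
is illustrative only and is not part of the GenPept distribution. The
names AMINO_ACID_CODES, parse_sequence_line and read_sequence are
hypothetical. It assumes each sequence line carries the residue number
in columns 1-9, a blank in column 10, and residues from column 11
onward, as stated in section 3.2.1.

```python
# Illustrative sketch; all names here are hypothetical helpers,
# not part of any GenPept tool.

# One-letter codes as listed in Appendix A.
AMINO_ACID_CODES = {
    "A": "Alanine", "R": "Arginine", "N": "Asparagine",
    "D": "Aspartic acid", "C": "Cysteine", "Q": "Glutamine",
    "E": "Glutamic acid", "G": "Glycine", "H": "Histidine",
    "I": "Isoleucine", "L": "Leucine", "K": "Lysine",
    "M": "Methionine", "F": "Phenylalanine", "P": "Proline",
    "S": "Serine", "T": "Threonine", "U": "Selenocysteine",
    "W": "Tryptophan", "Y": "Tyrosine", "V": "Valine",
    "B": "Aspartic acid or Asparagine",
    "Z": "Glutamic acid or Glutamine",
    "X": "Any amino acid",
}


def parse_sequence_line(line):
    """Return (first residue number, residues) for one sequence line."""
    start = int(line[:9])                    # columns 1-9, right-justified
    residues = line[10:].replace(" ", "")    # sequence begins in column 11
    return start, residues


def read_sequence(lines):
    """Join sequence lines, checking residue numbering and codes."""
    sequence = ""
    for line in lines:
        start, residues = parse_sequence_line(line.rstrip("\n"))
        if start != len(sequence) + 1:
            raise ValueError("unexpected residue number %d" % start)
        unknown = set(residues) - set(AMINO_ACID_CODES)
        if unknown:
            raise ValueError("unknown codes: %s" % sorted(unknown))
        sequence += residues
    return sequence
```

Applied to the lines between ORIGIN and // in an entry, read_sequence
returns the peptide as one string, whose length can then be compared
with the value on the LENGTH line.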