SEQFMT

NAME

seqfmt - Sequence formats

DESCRIPTION

This document illustrates some common formats used for sequence representation.
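
Most of the formats shown in this document can be told apart by their first non-blank line. The following Python sketch illustrates that idea. The function name guess_format and the returned labels are hypothetical, and the rules are derived only from the examples in this document, so real files may need stricter checks.

```python
def guess_format(text):
    """Guess a sequence format name from the first non-blank line of text."""
    for line in text.splitlines():
        first = line.strip()
        if first:
            break
    else:
        return None  # no non-blank line at all

    if first.startswith("!!"):
        return "GCG"
    if first.startswith("LOCUS"):
        return "GENBANK"
    if first.startswith("ENTRY"):
        return "CODATA"
    if first.startswith("ID "):
        # EMBL and SWISSPROT share the ID line; the unit at its end differs.
        return "SWISSPROT" if first.endswith("AA.") else "EMBL"
    if first.startswith(">"):
        # NBRF headers carry a two-character type code followed by ';'.
        return "NBRF" if first[3:4] == ";" else "FASTA"
    if first.startswith("#"):
        return "GDE"
    if first.startswith(";"):
        return "IG"
    # RAW has no marker of its own, so it is the fallback.
    return "RAW"
```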

EMBL

 ID   MMVASPHOS  standard; RNA; EST; 140 BP.
 AC   X97897;
 DE   M.musculus mRNA for protein homologous to
 DE   vasodilator-stimulated phosphoprotein
 SQ   Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
      ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac ctggagccag        60
      ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt ggcgcgccgc       120
      ccgccctccg cgcagccatg                                                   140
 //
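
As a rough illustration of how such an entry can be read, the Python sketch below collects the identifier from the ID line and the residues between the SQ line and the // terminator. The name parse_embl is hypothetical, and the sketch ignores every other line type.

```python
def parse_embl(text):
    """Return (identifier, sequence) pairs from EMBL-style flat-file text."""
    entries = []
    ident, chunks, in_seq = None, [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            # End of entry: emit what was collected and reset the state.
            entries.append((ident, "".join(chunks)))
            ident, chunks, in_seq = None, [], False
        elif in_seq:
            # Keep letters only, dropping blanks and the position counter.
            chunks.append("".join(c for c in line if c.isalpha()))
        elif line.startswith("ID "):
            ident = line.split()[1]
        elif line.startswith("SQ "):
            in_seq = True
    return entries
```

The sequence test comes before the ID and SQ tests so that a sequence line is never mistaken for a header line.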

FASTA

 >MMVASPHOS
 ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
 ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
 ccgccctccgcgcagccatg
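
A FASTA file may hold several records, each introduced by a line starting with '>'. The Python sketch below shows one possible way to split such text into records. The name parse_fasta is hypothetical, and the whole header line is kept as the record name.

```python
def parse_fasta(text):
    """Return (header, sequence) pairs from FASTA text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the record collected so far.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```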

GCG

 !!NA_SEQUENCE 1.0
 (No documentation)
 dna1.txt Length: 88 Nov 22, 2001 14:38 Type: N Check: 3818 ..
  1 TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT
 51 CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG
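
In the example above, the header ends with the line terminated by '..', and that line also declares the sequence length. The Python sketch below reads the declared length and the residues that follow, so the two can be compared. The name parse_gcg is hypothetical, and the sketch assumes a single ungapped sequence.

```python
import re

def parse_gcg(text):
    """Return (declared_length, sequence) from single-sequence GCG text."""
    declared, chunks, in_seq = None, [], False
    for line in text.splitlines():
        if in_seq:
            # Keep letters only, dropping blanks and position counters.
            chunks.append("".join(c for c in line if c.isalpha()))
        elif line.rstrip().endswith(".."):
            # The line ending in '..' closes the header.
            match = re.search(r"Length:\s*(\d+)", line)
            declared = int(match.group(1)) if match else None
            in_seq = True
    return declared, "".join(chunks)
```

For the example above, the declared length and the number of residues read both come out as 88.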

GDE

 #sample1
 TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
 #sample2
 TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG

GENBANK

 LOCUS       HUMHBV1       130 bp    DNA         PRI     17-JUN-1993
 DEFINITION  Human DNA/endogenous Hepatitis B virus (HBV) DNA, left
             host viral junction.
 ACCESSION   M15770
 BASE COUNT       32 a     43 c     29 g     26 t
 ORIGIN
       1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc
      61 ctcctgtcac aacaggccca ttcaattctg aacctgcaag ccaactccaa
     121 cctcttttcc cagggggaac caaaaaccct
 //
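
The sequence of a GenBank entry sits between the ORIGIN line and the // terminator, with a position counter at the start of each line. The Python sketch below extracts it and tallies the bases so the result can be compared with the BASE COUNT line by hand. The names genbank_sequence and base_count are hypothetical, and only the first entry of the text is read.

```python
from collections import Counter

def genbank_sequence(text):
    """Return the sequence found between ORIGIN and // in GenBank text."""
    chunks, in_origin = [], False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("//"):
            break  # end of the first entry
        if in_origin:
            # Keep letters only, dropping blanks and the position counter.
            chunks.append("".join(c for c in line if c.isalpha()))
        elif line.startswith("ORIGIN"):
            in_origin = True
    return "".join(chunks)

def base_count(sequence):
    """Return a per-base tally, ignoring letter case."""
    return Counter(sequence.lower())
```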

IG

 ; comment
 U03518
 AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
 TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
 TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1
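
The trailing 1 in the example is not a residue: in the IG format the sequence ends with a terminator digit, 1 for a linear and 2 for a circular sequence. The Python sketch below reads records on that basis. The name parse_ig is hypothetical, and the sketch assumes comment lines start with ';' and that the first line after them names the sequence.

```python
def parse_ig(text):
    """Return (identifier, sequence, topology) tuples from IG text."""
    records = []
    ident, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(";"):
            continue  # blank or comment line
        if ident is None:
            ident = line  # first non-comment line names the sequence
            continue
        if line[-1] in "12":
            # Terminator reached: 1 marks a linear, 2 a circular sequence.
            chunks.append(line[:-1])
            topology = "linear" if line[-1] == "1" else "circular"
            records.append((ident, "".join(chunks), topology))
            ident, chunks = None, []
        else:
            chunks.append(line)
    return records
```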

NBRF (pir)

 >P1;CCHU
 cytochrome c [validated] - human
 MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
 GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
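
An NBRF record, as shown above, has a header line with a type code and an identifier separated by ';', then a description line, then the sequence ending with '*'. The Python sketch below follows that layout. The name parse_nbrf is hypothetical, and the sketch assumes every record carries its '*' terminator.

```python
def parse_nbrf(text):
    """Return (type_code, identifier, description, sequence) tuples."""
    records = []
    lines = text.splitlines()
    i = 0
    while i < len(lines):
        header = lines[i].strip()
        i += 1
        if not (header.startswith(">") and ";" in header):
            continue  # not a record header
        type_code, ident = header[1:].split(";", 1)
        # The line right after the header is the free-text description.
        description = lines[i].strip() if i < len(lines) else ""
        i += 1
        chunks = []
        while i < len(lines):
            chunk = lines[i].strip()
            i += 1
            if "*" in chunk:
                # '*' terminates the sequence and is not a residue.
                chunks.append(chunk.split("*", 1)[0])
                break
            chunks.append(chunk)
        records.append((type_code, ident, description, "".join(chunks)))
    return records
```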

CODATA

 ENTRY           CCHU  #type complete
 TITLE           cytochrome c [validated] - human
 ACCESSIONS      A31764; A05676; I55192; A00001
 SUMMARY         #length 105  #molecular-weight 11749  #checksum 3247
 SEQUENCE
                  5        10        15        20        25        30
        1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T G
       31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I W
       61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K E
       91 E R A D L I A Y L K K A T N E
 ///

RAW

 ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
 ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
 ccgccctccgcgcagccatg

Warning: This format cannot handle more than one sequence per file.
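
The restriction follows from the layout: a raw file has no header and no record separator, so everything in it belongs to one sequence. A minimal Python sketch, with the hypothetical name read_raw, simply joins the lines:

```python
def read_raw(text):
    """Return the single sequence held in raw text, with whitespace removed."""
    # split() with no argument breaks on any run of blanks or newlines.
    return "".join(text.split())
```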

SWISSPROT

 ID   100K_RAT       STANDARD;      PRT;   149 AA.
 AC   Q62671;
 DE   100 kDa protein (EC 6.3.2.-).
 SQ   SEQUENCE   149 AA;  17004 MW;  D06484B8BC29112E CRC64;
      MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK
      PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN
      SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
 //
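
The SQ line of the example above carries the sequence length, the molecular weight and a CRC64 checksum. The Python sketch below pulls these three fields out of such a line. The names SQ_LINE and parse_sq_line are hypothetical, and the pattern is modelled only on the line shown in this document.

```python
import re

# Pattern modelled on the SQ line of the example entry.
SQ_LINE = re.compile(
    r"^SQ\s+SEQUENCE\s+(\d+)\s+AA;\s+(\d+)\s+MW;\s+([0-9A-F]{16})\s+CRC64;"
)

def parse_sq_line(line):
    """Return (length, molecular_weight, crc64) from a Swiss-Prot SQ line."""
    match = SQ_LINE.match(line.strip())
    if match is None:
        raise ValueError("not a Swiss-Prot SQ line")
    length, weight, crc = match.groups()
    return int(length), int(weight), crc
```

The returned length can then be compared with the number of residues read from the lines that follow, as a simple consistency check.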

SEE ALSO

squizz(1), alifmt(5)

AUTHOR

Nicolas Joly (njoly AT pasteur DOT fr), Institut Pasteur.
