GATA binding protein 1 (GATA1) - coding DNA reference sequence

(used for mutation description)

(last modified February 2, 2018)


This file was created to facilitate the description of sequence variants in the GATA1 gene based on a coding DNA reference sequence following the HGVS recommendations.

The sequence was taken from genomic reference sequence NG_008846.2, covering GATA1 transcript NM_002049.3.


 (upstream sequence)
           .         .         .         .         .                g.5051
          caaaggccaaggccagccaggacaccccctgggatcacactgagcttgcca       c.-61

 .         .         .         .         . | 02       .             g.9556
 catccccaaggcggccgaaccctccgcaaccaccagcccag | gttaatccccagaggctcc    c.-1

          .         .         .         .         .         .       g.9616
 ATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGAT       c.60
  M  E  F  P  G  L  G  S  L  G  T  S  E  P  L  P  Q  F  V  D        p.20

          .         .         .         .         .         .       g.9676
 CCTGCTCTGGTGTCCTCCACACCAGAATCAGGGGTTTTCTTCCCCTCTGGGCCTGAGGGC       c.120
  P  A  L  V  S  S  T  P  E  S  G  V  F  F  P  S  G  P  E  G        p.40

          .         .         .         .         .         .       g.9736
 TTGGATGCAGCAGCTTCCTCCACTGCCCCGAGCACAGCCACCGCTGCAGCTGCGGCACTG       c.180
  L  D  A  A  A  S  S  T  A  P  S  T  A  T  A  A  A  A  A  L        p.60

          .         .         .         . | 03       .         .    g.10310
 GCCTACTACAGGGACGCTGAGGCCTACAGACACTCCCCAG | TCTTTCAGGTGTACCCATTG    c.240
  A  Y  Y  R  D  A  E  A  Y  R  H  S  P   | V  F  Q  V  Y  P  L     p.80

          .         .         .         .         .         .       g.10370
 CTCAACTGTATGGAGGGGATCCCAGGGGGCTCACCATATGCCGGCTGGGCCTACGGCAAG       c.300
  L  N  C  M  E  G  I  P  G  G  S  P  Y  A  G  W  A  Y  G  K        p.100

          .         .         .         .         .         .       g.10430
 ACGGGGCTCTACCCTGCCTCAACTGTGTGTCCCACCCGCGAGGACTCTCCTCCCCAGGCC       c.360
  T  G  L  Y  P  A  S  T  V  C  P  T  R  E  D  S  P  P  Q  A        p.120

          .         .         .         .         .         .       g.10490
 GTGGAAGATCTGGATGGAAAAGGCAGCACCAGCTTCCTGGAGACTTTGAAGACAGAGCGG       c.420
  V  E  D  L  D  G  K  G  S  T  S  F  L  E  T  L  K  T  E  R        p.140

          .         .         .         .         .         .       g.10550
 CTGAGCCCAGACCTCCTGACCCTGGGACCTGCACTGCCTTCATCACTCCCTGTCCCCAAT       c.480
  L  S  P  D  L  L  T  L  G  P  A  L  P  S  S  L  P  V  P  N        p.160

          .         .         .         .         .         .       g.10610
 AGTGCTTATGGGGGCCCTGACTTTTCCAGTACCTTCTTTTCTCCCACCGGGAGCCCCCTC       c.540
  S  A  Y  G  G  P  D  F  S  S  T  F  F  S  P  T  G  S  P  L        p.180

          .         .         .         .         .         | 04    g.10771
 AATTCAGCAGCCTATTCCTCTCCCAAGCTTCGTGGAACTCTCCCCCTGCCTCCCTGTG | AG    c.600
  N  S  A  A  Y  S  S  P  K  L  R  G  T  L  P  L  P  P  C   | E      p.200

          .         .         .         .         .         .       g.10831
 GCCAGGGAGTGTGTGAACTGCGGAGCAACAGCCACTCCACTGTGGCGGAGGGACAGGACA       c.660
  A  R  E  C  V  N  C  G  A  T  A  T  P  L  W  R  R  D  R  T        p.220

          .         .         .         .         .         .       g.10891
 GGCCACTACCTATGCAACGCCTGCGGCCTCTATCACAAGATGAATGGGCAGAACAGGCCC       c.720
  G  H  Y  L  C  N  A  C  G  L  Y  H  K  M  N  G  Q  N  R  P        p.240

          .         .     | 05   .         .         .         .    g.11654
 CTCATCCGGCCCAAGAAGCGCCTG | ATTGTCAGTAAACGGGCAGGTACTCAGTGCACCAAC    c.780
  L  I  R  P  K  K  R  L  |  I  V  S  K  R  A  G  T  Q  C  T  N     p.260

          .         .         .         .         .         .       g.11714
 TGCCAGACGACCACCACGACACTGTGGCGGAGAAATGCCAGTGGGGATCCCGTGTGCAAT       c.840
  C  Q  T  T  T  T  T  L  W  R  R  N  A  S  G  D  P  V  C  N        p.280

          .         .         . | 06       .         .         .    g.12269
 GCCTGCGGCCTCTACTACAAGCTACACCAG | GTGAACCGGCCACTGACCATGCGGAAGGAT    c.900
  A  C  G  L  Y  Y  K  L  H  Q  |  V  N  R  P  L  T  M  R  K  D     p.300

          .         .         .         .         .         .       g.12329
 GGTATTCAGACTCGAAACCGCAAGGCATCTGGAAAAGGGAAAAAGAAACGGGGCTCCAGT       c.960
  G  I  Q  T  R  N  R  K  A  S  G  K  G  K  K  K  R  G  S  S        p.320

          .         .         .         .         .         .       g.12389
 CTGGGAGGCACAGGAGCAGCCGAAGGACCAGCTGGTGGCTTTATGGTGGTGGCTGGGGGC       c.1020
  L  G  G  T  G  A  A  E  G  P  A  G  G  F  M  V  V  A  G  G        p.340

          .         .         .         .         .         .       g.12449
 AGCGGTAGCGGGAATTGTGGGGAGGTGGCTTCAGGCCTGACACTGGGCCCCCCAGGTACT       c.1080
  S  G  S  G  N  C  G  E  V  A  S  G  L  T  L  G  P  P  G  T        p.360

          .         .         .         .         .         .       g.12509
 GCCCATCTCTACCAAGGCCTGGGCCCTGTGGTGCTGTCAGGGCCTGTTAGCCACCTCATG       c.1140
  A  H  L  Y  Q  G  L  G  P  V  V  L  S  G  P  V  S  H  L  M        p.380

          .         .         .         .         .         .       g.12569
 CCTTTCCCTGGACCCCTACTGGGCTCACCCACGGGCTCCTTCCCCACAGGCCCCATGCCC       c.1200
  P  F  P  G  P  L  L  G  S  P  T  G  S  F  P  T  G  P  M  P        p.400

          .         .         .         .                           g.12611
 CCCACCACCAGCACTACTGTGGTGGCTCCGCTCAGCTCATGA                         c.1242
  P  T  T  S  T  T  V  V  A  P  L  S  S  X                          p.413

          .         .         .         .         .         .       g.12671
 gggcacagagcatggcctccagaggaggggtggtgtccttctcctcttgtagccagaatt       c.*60

          .         .         .         .         .         .       g.12731
 ctggacaacccaagtctctgggccccaggcaccccctggcttgaaccttcaaagcttttg       c.*120

          .         .                                               g.12758
 taaaataaaaccaccaaagtcctgaaa                                        c.*147

 (downstream sequence)
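
The protein line printed under each coding row can be re-derived from the DNA line above it. The following is a minimal Python sketch of that check, assuming the standard genetic code and the file's convention of writing the stop codon as X. The names `translate`, `FIRST_ROW` and `LAST_ROW` are illustrative and not part of the LOVD output; the two sequences are copied from the rows ending at c.60 and c.1242.

```python
# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(coding_dna: str, stop_symbol: str = "X") -> str:
    """Translate an in-frame coding DNA string into one-letter amino acids.

    Stop codons are written as `stop_symbol` (X, matching this file).
    """
    seq = coding_dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(seq), 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        protein.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(protein)


# Row c.1 to c.60 and row c.1201 to c.1242, copied from the sequence above.
FIRST_ROW = "ATGGAGTTCCCTGGCCTGGGGTCCCTGGGGACCTCAGAGCCCCTCCCCCAGTTTGTGGAT"
LAST_ROW = "CCCACCACCAGCACTACTGTGGTGGCTCCGCTCAGCTCATGA"

print(translate(FIRST_ROW))  # MEFPGLGSLGTSEPLPQFVD
print(translate(LAST_ROW))   # PTTSTTVVAPLSSX
```

Rows that contain an intron marker have to be joined without the " | " separator before translation, because a codon can span the exon boundary.
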
Legend:
Nucleotide numbering (following the HGVS rules for a 'Coding DNA Reference Sequence') is indicated at the right of the sequence, counting the A of the ATG translation-initiating methionine as c.1. Positions upstream of the ATG are numbered negatively (c.-1 is the nucleotide immediately preceding the A), and positions downstream of the stop codon carry an asterisk (c.*1 is the first nucleotide after the stop codon). The genomic (g.) position of the last nucleotide in each row is shown at the right, above the c. label. Every 10th nucleotide is indicated by a "." above the sequence. The GATA binding protein 1 (GATA1) protein sequence is shown below the coding DNA sequence, with numbering indicated at the right starting with 1 for the translation-initiating methionine; the stop codon is shown as X. Every 10th amino acid is shown in bold. The position of each intron is indicated by a vertical line separating the two flanking exons. The start of the first exon (transcription initiation site) is indicated by a '\', the end of the last exon (poly-A addition site) by a '/'. The exon number is indicated above the first nucleotide(s) of the exon. To aid the description of frameshift mutations, all stop codons in the +1 frame are shown in bold, while all stop codons in the +2 frame are underlined.
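
The numbering rule in the legend can be sketched in a few lines of Python. This is an illustration only: the function names and the constants `CDS_START` and `CDS_LENGTH` are assumptions introduced here, not part of LOVD. The value 111 is read off the row labels (the first displayed row holds 51 nucleotides and ends at c.-61, so the display starts at c.-111), and 1242 is the label on the row containing the stop codon.

```python
def coding_label(index: int, cds_start: int, cds_length: int) -> str:
    """Return the c. label for a 0-based index into the displayed sequence.

    cds_start  -- 0-based index of the A of the ATG start codon
    cds_length -- number of coding nucleotides, stop codon included
    """
    offset = index - cds_start
    if offset < 0:
        # Upstream of the ATG: c.-1 immediately precedes the A; there is no c.0.
        return f"c.{offset}"
    if offset < cds_length:
        # Coding region: the A of the ATG is c.1.
        return f"c.{offset + 1}"
    # Downstream of the stop codon: c.*1 is the first nucleotide after it.
    return f"c.*{offset - cds_length + 1}"


def codon_number(c_position: int) -> int:
    """Return the 1-based codon (amino acid) number for a coding position."""
    if c_position < 1:
        raise ValueError("only positions c.1 and higher fall in a codon")
    return (c_position - 1) // 3 + 1


# Assumed values read from the row labels of this file (see the text above).
CDS_START = 111    # 111 displayed nucleotides precede the ATG (c.-111 to c.-1)
CDS_LENGTH = 1242  # c.1 to c.1242, stop codon included

print(coding_label(0, CDS_START, CDS_LENGTH))     # c.-111
print(coding_label(110, CDS_START, CDS_LENGTH))   # c.-1
print(coding_label(111, CDS_START, CDS_LENGTH))   # c.1
print(coding_label(1353, CDS_START, CDS_LENGTH))  # c.*1
print(codon_number(220))                          # 74
```

Here the index counts nucleotides of the displayed sequence with the intron markers removed. The last call agrees with the protein line: c.220 is the final nucleotide before the exon 03 marker, and the codon it begins (GTC, spanning that boundary) is the V at position 74.
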

Powered by LOVD v.2.0 Build 35
©2004-2018 Leiden University Medical Center