Binary Codes for CODONS

Nucleotide Bases Binary Code
A
G
U/T
C
total of polar & (Average - total)   &
total of non-polar & (Average - total)   &
anti-codon : Polar (3'-5')
anti-codon :Non-polar (3'-5')
anti-codon : Polar (5'-3')
anti-codon :Non-polar (5'-3')
aaRS-I (for anticodons)
aaRS-II (for anticodons)
aaRS-I (polar for anticodons) (3'-5')
aaRS-I (non-polar for anticodons) (3'-5')
aaRS-II (polar for anticodons) (3'-5')
aaRS-II (non-polar for anticodons) (3'-5')
aaRS-I (polar for codons)
aaRS-I (non-polar for codons)
aaRS-I (minus STOP codon)
aaRS-I (including STOP codon)
aaRS-II (polar for codons)
aaRS-II (non-polar for codons)
aaRS-II (for codons)
Total of non-essential AA
Total of non-essential AA+His-ggg,gga of Gly
Total of essential AA
Total of essential AA-His+ggg,gga of Gly
total 4-fold degenerate
total 2-fold degenerate
total 4-fold degenerate+Trp-ggg,gga of Gly
total 2-fold degenerate-Trp+ggg,gga of Gly
aaRS-II --(anticodons)
           Difference in   ⇒  number  no.of zeros   

 

POLAR AMINO ACIDS [32 codons, cg=44; au=52,ag=56,cu=40 total =96] [c=24,g=20,a=36,u=16][u absent in middle]

In the 1st & 2nd positions, the base combinations are: pyrimidine-pyrimidine 4, purine-purine 12, mixed 16. Non-repeating codons: 14; repeating codons: 18 (a repeating codon contains the same base more than once). All nucleotides occur an even number of times. The number of u is half that in the non-polar AA; the number of c is the same as in the non-polar AA.

* codons with grey background are frequently used codons for respective amino acids.

Non-Polar Amino Acids [32 codons, cg=52;au=44,ag=40,cu=56; total=96][c=24,g=28,a=12,u=32][a absent in middle]

In the 1st & 2nd positions, the base combinations are: pyrimidine-pyrimidine 12, purine-purine 4, mixed 16. Non-repeating codons: 10; repeating codons: 22. The number of c is twice that of a. All nucleotides occur an even number of times.
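The counts quoted for the polar and non-polar segments can be re-derived with a short script. This is only a sketch: the 64-letter `AA` string, the `POLAR` set and the helper `summarize` are names made up here, and the script assumes the assignments used in this document's tables (uga = Trp, aua = Met, STOP grouped with the polar side). A "repeating" codon is taken to be one that contains the same base more than once.

```python
from collections import Counter
from itertools import product

BASES = "ucag"
# One-letter amino acids for the codons in u, c, a, g order (1st, 2nd, 3rd base).
# Assumption: uga = Trp (W) and aua = Met (M), as in this document's tables.
AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}
POLAR = set("TSRQNDEY*KH")  # STOP (*) is counted on the polar side here


def summarize(codons):
    """Return base counts, 1st/2nd-position classes, repeating and non-repeating totals."""
    counts = Counter("".join(codons))
    pyrimidines = set("uc")
    labels = ("purine-purine", "mixed", "pyrimidine-pyrimidine")
    kinds = Counter()
    for codon in codons:
        n_pyr = (codon[0] in pyrimidines) + (codon[1] in pyrimidines)
        kinds[labels[n_pyr]] += 1
    repeating = sum(len(set(codon)) < 3 for codon in codons)
    return dict(counts), dict(kinds), repeating, len(codons) - repeating


polar = [c for c, aa in CODE.items() if aa in POLAR]
nonpolar = [c for c, aa in CODE.items() if aa not in POLAR]

print("polar    ", summarize(polar))
print("non-polar", summarize(nonpolar))
```

Under these assumptions each segment holds 32 codons, and the printed base counts can be compared directly with the bracketed figures in the two segment headings.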

code number codon AA no. of zeros no.of 1 code number codon AA no. of zeros no. of 1
Thr(T)-aaRSIIa Pro(P)-aaRSIIa
   
   
   
Ser(S)-aaRSIIa Ala(A)-aaRSIIc
   
   
   
Arg(+ve)-aaRSIc Gly(G)-aaRSIIc
   
   
   
(R)Arg(+ve) Val(V)-aaRSIa
   
Ser  
   
Gln(Q)-aaRSIc Leu-aaRSIa
   
Asn(N)-aaRSIIb  
   
(D)Asp(-ve)-aaRSIIb Cys(C)-aaRSIa
   
(E)Glu(-ve)-aaRSIc Trp(W)-aaRSIb
   
Tyr(Y)-aaRSIb Phe(F)-aaRSIIc
   
STOP-aaRSIb Ile(I)-aaRSIa
   
(K)Lys(+ve)-aaRSIIb Met(M)-aaRSIa
   
His(H)-aaRSIIa Leu(L)-aaRSIa
   
           
AA (polar)      pure repeating codons   pure non-repeating codons   mixed codons
no.             03                      01                          06
name            Lys, Glu, Asn           Asp                         His, Tyr, Gln, Ser, Arg, Thr

AA (non-polar)  pure repeating codons   pure non-repeating codons   mixed codons
no.             03                      nil                         07
name            Phe, Gly, Pro                                       Leu, Met, Ile, Trp, Cys, Val, Ala
                       
 
 

* Codons under grey background are optimal codons for the corresponding AA. The optimal codons given for Ser, Thr, Arg, Ala, Val and Gly are to be rechecked.

* aug is the initiating codon, but in some prokaryotes gug is the initiating codon.

* It is observed that optimal codons normally have a or c in their 3rd position.

* Codons under violet background are optimal codons if the no. of codons for each of them is greater than 1.

 

* Some AA exhibit 2-fold, some 3-fold, some 4-fold and some 6-fold degeneracy with respect to codons. All the codons coding for the same AA are called synonymous codons. Initially, this "degeneracy" seemed a bit wasteful, but evidence has been accumulating that synonymous codons actually behave differently, leading to different functional outcomes (see ENV's coverage). The first clues came from observations that different synonymous codons affect the rate of translation in the ribosome: apparently there is an "optimal" codon that translates quickly, while the others cause a bit of delay.

Like many written languages, the genetic code is filled with synonyms: differently spelled "words" that have the same or very similar meanings. For a long time, biologists thought that these synonymous codons were interchangeable. Recently, they have realized that this is not the case and that differences in synonymous codon usage have a significant impact on cellular processes, so scientists have advanced a wide variety of ideas about the role that these variations play.

Under polar aa,

(ser,thr,his)-aaRSIIa (asp,asn,lys)-aaRSIIb (glu,gln,arg)-aaRSIc (tyr)-aaRSIb

Under nonpolar aa,

 (leu,ile,val,met,cys)-aaRSIa                  (ala,gly,phe)-aaRSIIc (trp)-aaRSIb              (pro)-aaRSIIa

The AA (trp, pro) are slightly polar in nature and hence fall under the aaRS categories of the polar AA. In our categorization above, however, we have put them under non-polar AA.

* The anagram of a codon (the codon read in reverse) in the polar segment also remains in the polar segment, and the anagram of a codon in the non-polar segment remains in the non-polar segment.

Exceptions: 4 codon pairs among the non-repeating codons, (ucg-gcu, Ser-Ala), (cgu-ugc, Arg-Cys), (agu-uga, Ser-Trp), (gca-acg, Ala-Thr), and 2 codons each of Pro and Gly among the repeating codons, (cca-acc, Pro-Thr), (ccu-ucc, Pro-Ser), (ggc-cgg, Gly-Arg), (gga-agg, Gly-Arg).
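The anagram rule and its exceptions can be checked mechanically. The sketch below reads "anagram" as the codon written in reverse, which is what the listed pairs suggest, and reuses the assumed assignments of this document (uga = Trp, aua = Met, STOP on the polar side). The names `AA`, `POLAR`, `is_polar` and `exceptions` are illustrative only.

```python
from itertools import product

BASES = "ucag"
# Assumption: uga = Trp (W) and aua = Met (M), as in this document's tables.
AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}
POLAR = set("TSRQNDEY*KH")


def is_polar(codon):
    return CODE[codon] in POLAR


# An exception is a polar codon whose reverse lands in the non-polar segment.
exceptions = sorted(
    (codon, codon[::-1])
    for codon in CODE
    if is_polar(codon) and not is_polar(codon[::-1])
)

for polar_codon, nonpolar_codon in exceptions:
    print(polar_codon, CODE[polar_codon], "<->", nonpolar_codon, CODE[nonpolar_codon])
```

Under these assumptions the script reports eight cross-segment pairs, the same eight that are listed as exceptions above.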

 

Anti-Codon Code table as per polar/non-polar Amino Acids

POLAR AMINO ACIDS [32 anti-codons, cg=44; au=52,ag=40,cu=56 total =96] [c=20,g=24,a=16,u=36][a absent in middle]

In the 1st & 2nd positions, the base combinations are: pyrimidine-pyrimidine 12, purine-purine 04, mixed 16. Non-repeating anti-codons: 14; repeating anti-codons: 18. All nucleotides occur an even number of times. The number of a is half that in the non-polar AA (16 vs 32); the number of g is the same as in the non-polar AA.

* codons with grey background are frequently used codons for respective amino acids.

Non-Polar Amino Acids [32 anti-codons, cg=52; au=44, ag=56, cu=40; total=96] [c=28, g=24, a=32, u=12] [u absent in middle]

In the 1st & 2nd positions, the base combinations are: pyrimidine-pyrimidine 04, purine-purine 12, mixed 16. Non-repeating anti-codons: 10; repeating anti-codons: 22. The number of u is half that of g. All nucleotides occur an even number of times.

code number anti-codon AA code number anti-codon AA
  -(agu) uga Thr(T)-aaRSIIa   -(agg) gga Pro(P)-aaRSIIa
  -(ggu) ugg     -(ggg) ggg  
  -(ugu) ugu     -(ugg) ggu  
- -(cgu) ugc   - -(cgg) ggc  
  -(aga) aga Ser(S)-aaRSIIa   -(agc) cga Ala(A)-aaRSIIc
  -(gga) agg     -(ggc) cgg  
  -(uga) agu     -(ugc) cgu  
- -(cga) agc   - -(cgc) cgc  
  -(acg) gca Arg(+ve)-aaRSIc   -(acc) cca Gly(G)-aaRSIIc
  -(gcg) gcg     -(gcc) ccg  
  -(ucg) gcu     -(ucc) ccu  
- -(ccg) gcc   - -(ccc) ccc  
  -(ucu) ucu (R)Arg(+ve)   -(aac) caa Val(V)-aaRSIa
- -(ccu) ucc     -(gac) cag  
  -(acu) uca Ser   -(uac) cau  
- -(gcu) ucg   - -(cac) cac  
  -(uug) guu Gln(Q)-aaRSIc    -(aag) gaa Leu-aaRSIa
- -(cug) guc     -(gag) gag  
  -(auu) uua Asn(N)-aaRSIIb   -(uag) gau  
- -(guu) uug   - -(cag) gac  
  -(auc) cua (D)Asp(-ve)-aaRSIIb   -(aca) aca Cys(C)-aaRSIa
- -(guc) cug   - -(gca) acg  
  -(uuc) cuu (E)Glu(-ve)-aaRSIc   -(uca) acu Trp(W)-aaRSIb
- -(cuc) cuc   - -(cca) acc  
  -(aua) aua Tyr(Y)-aaRSIb   -(aaa) aaa Phe(F)-aaRSIIc
  -(gua) aug   - -(gaa) aag  
  -(uua) auu STOP-aaRSIb   -(aau) uaa Ile(I)-aaRSIa
- -(cua) auc   - -(gau) uag  
  -(uuu) uuu (K)Lys(+ve)-aaRSIIb   -(uau) uau Met(M)-aaRSIa
- -(cuu) uuc   - -(cau) uac  
  -(aug) gua His(H)-aaRSIIa   -(uaa) aau Leu(L)-aaRSIa
- -(gug) gug   - -(caa) aac  
  -       -    
AA (polar)      pure repeating codons   pure non-repeating codons   mixed codons
no.             03                      01                          06
name            Lys, Glu, Asn           Asp                         His, Tyr, Gln, Ser, Arg, Thr

AA (non-polar)  pure repeating codons   pure non-repeating codons   mixed codons
no.             03                      nil                         07
name            Phe, Gly, Pro                                       Leu, Met, Ile, Trp, Cys, Val, Ala
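Each row of the anti-codon table gives the anti-codon in two orientations. A minimal sketch of how both can be derived from a codon follows. It assumes that the bracketed entry is the anti-codon read 5'-3' and the unbracketed entry is the same anti-codon read 3'-5'; the function names are made up for illustration.

```python
# Watson-Crick complement for RNA bases: a<->u, c<->g.
COMPLEMENT = str.maketrans("acgu", "ugca")


def anticodon_3to5(codon):
    """Anti-codon written 3'->5', i.e. aligned base by base with the codon."""
    return codon.translate(COMPLEMENT)


def anticodon_5to3(codon):
    """The same anti-codon written in the conventional 5'->3' direction."""
    return anticodon_3to5(codon)[::-1]


for codon in ("acu", "acc", "aug"):
    print(codon, "->", f"({anticodon_5to3(codon)})", anticodon_3to5(codon))
```

For the Thr codon acu this gives (agu) and uga, which is how the first Thr row of the table reads.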
               
 
   

 

 

 


 

 

Code table as per Essential / Non-essential Amino Acids

Non-Essential AA Essential AA
 
* All non-essential amino acids are glucogenic (except Tyr, which is both glucogenic & ketogenic). The reverse is not necessarily true.

* Young chicks require Gly, but adults can synthesize it.

* a=23 u=19, c=27, g=27 , total=96 ; au=42,cg=54, ag=50, cu=46 ; Repeating codons-20, non-repeating codons-12, total no. of codons=32

* 7 pairs of codons are anagrams of each other: (ucc, ucg, caa, cag, aau, gau, ccg & ccu, gcu, aac, gac, uaa, uag, gcc); 7 codons are their own anagrams (ucu, gag, uau, ccc, gcg, ggg, ugu); 11 have their anagrams on the opposite side.

* u - absent in middle position. (even if one adds Arg,His). No. of C=No. of G;

* All nucleotides occur in odd no. (even if one adds Arg, His).

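* With A=01, C=10, G=00, U/T=11, the sum of non-essential AA = 932 and the sum of essential AA = 1084. The average of the two sides is 1008, so LHS - average = -76 (and RHS - average = +76). If Histidine is brought to the non-essential AA and ggg, gga of Glycine are taken to the essential AA, the no. of codons on each side remains unaltered and the difference becomes zero.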
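The sums in this note can be reproduced by reading each codon as a 6-bit number (two bits per base, first base most significant). The sketch below assumes that reading, the code A=01, C=10, G=00, U=11, and the non-essential set used in this document (Ser, Gln, Asn, Asp, Glu, Tyr, STOP, Pro, Ala, Gly, Cys, with uga = Trp and aua = Met). All names in it are illustrative.

```python
from itertools import product

BASES = "ucag"
# Assumption: uga = Trp (W) and aua = Met (M), as in this document's tables.
AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}

BITS = {"a": 0b01, "c": 0b10, "g": 0b00, "u": 0b11}
NON_ESSENTIAL = set("SQNDEY*PAGC")  # STOP (*) sits on the non-essential side


def codon_number(codon):
    """Read the three 2-bit base codes as one 6-bit number."""
    b1, b2, b3 = (BITS[base] for base in codon)
    return (b1 << 4) | (b2 << 2) | b3


non_essential_sum = sum(codon_number(c) for c, aa in CODE.items() if aa in NON_ESSENTIAL)
essential_sum = sum(codon_number(c) for c, aa in CODE.items() if aa not in NON_ESSENTIAL)
average = (non_essential_sum + essential_sum) // 2

# Move His (cau, cac) to the non-essential side and ggg, gga of Gly to the essential side.
moved = (codon_number("cau") + codon_number("cac")
         - codon_number("ggg") - codon_number("gga"))

print(non_essential_sum, essential_sum, average)
print(non_essential_sum + moved, essential_sum - moved)
```

The swap works because His contributes 39 + 38 = 77 while ggg and gga contribute 0 + 1 = 1, a net shift of 76 in each direction.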

* For non-polar, non-essential AA, a is absent in 1st position.

* Lys, Leu (red color) are purely ketogenic. Phe, Trp, Ile (pink color) are both ketogenic & glucogenic. The rest are purely glucogenic.

* Arg, His are partially essential in rats.

* In plants, Met, Lys are produced in very small quantities.

* His is essential for infants. The adult body can produce Histidine.

* a=25, u=29,c=21; g=21, total=96; au=54,cg=42, ag=46,cu=50;Repeating codons-20, non-repeating codons-12, Total no. of codons = 32

* 6 pairs of codons are anagrams of each other: (gua, guu, guc, cuu, cua, uua & aug, uug, cug, uuc, auc, auu); 9 codons are their own anagrams (aca, cgc, aga, aaa, cac, gug, cuc, uuu, aua); 11 have their anagrams on the opposite side.

* In the non-polar AA, the entire set of 16 codons with 'u' in the middle position is present; these make up all but the 2 Trp codons of the total of 18 codons.

* All nucleotides occur in odd no. (even if one subtracts Arg,His).

* For polar, essential AA, u/g  absent in 1st position.

               
AA P/NP codon No. AA P/NP codon No.
Ser P -aaRS II a Thr P-aaRS IIa
  P   P
  P   P
  P   P
  P      
  P Arg P-aaRS Ic
        P
Gln P-aaRS Ic   P
  P   P
        P
Asn P-aaRS IIb   P
  P      
      Lys P-aaRS IIb
Asp P-aaRS IIb   P
  P      
      His P-aaRS IIa
Glu P-aaRS Ic   P
  P      
             
Tyr P-aaRS Ib        
  P        
             
STOP          
           
             
Polar non-essential AA: a=20, u=13, c=11, g=10, total=54; au=33, cg=21, ag=30, cu=24;

repeating codons-9, non-repeating codons-9

Polar essential AA: a=16, u=3, c=13, g=10, total=42; au=19, cg=23, ag=26, cu=16;

repeating codons-9, non-repeating codons-5

Pro NP-aaRS IIa Val NP-aaRS Ia
  NP   NP
  NP   NP
  NP   NP
           
Ala NP-aaRS IIc Leu NP-aaRS Ia
  NP   NP
  NP   NP
  NP   NP
        NP
Gly NP-aaRS IIc   NP
  NP      
  NP Trp NP-aaRS Ib
  NP   NP
           
Cys NP-aaRS Ia Phe NP-aaRS IIc
  NP   NP
           
        Ile NP-aaRS Ia
          NP

        Met NP-aaRS Ia
          NP

Non-polar non-essential AA: a=3, u=6, c=16, g=17, total=42; au=9, cg=33, ag=20, cu=22;

repeating codons-11, non-repeating codons-3

Non-polar essential AA: a=9, u=26, c=8, g=11, total=54; au=35, cg=19, ag=20, cu=34;

repeating codons-11, non-repeating codons-7

<--- total if His is added & gga, ggg of Gly are taken to the other side (total)

<--- Difference

    Difference <total if His is minus & gga,ggg of Gly is plus

<--- Difference

Difference              (total)
               
               
               

Code table as per 4-fold (8 slots in code table) / 2-fold (16 slots in code table) degeneracy of Amino Acids

4-fold degenerate               2-fold degenerate
Sl. No.  Amino Acid  No.        Sl. No.  Amino Acid  No.
1        Leu                    1        Phe
2        Val                    2        Leu
3        Ser                    3        Ile
4        Pro                    4        Met
5        Thr                    5        Tyr
6        Ala                    6        STOP
7        Arg                    7        His
8        Gly                    8        Gln
                                9        Asn
                                10       Lys
                                11       Asp
                                12       Glu
                                13       Cys
                                14       Trp
                                15       Arg
                                16       Ser
       
           
           

Code Table as aaRS-I & aaRS-II linkage of Amino Acids

  aaRS-I     aaRS-II  
Sl. No. Amino Acid No. Sl. No. Amino Acid No.
1 Leu(L) -NP 1 Ser(S)-P
2 Val(V) -NP 2 Pro(P)-NP
3 Ile(I)    -NP 3 Thr(T)-P
4 Cys(C) -NP 4 His(H)-P
5 Met(M) -NP      
aaRS-Ia (16 codons)   aaRS-IIa (16 codons)  
6 Tyr(Y) -P 5 Lys(K)-P
7 Trp(W)-NP 6 Asp(D)-P
11 STOP 7 Asn(N)-P
aaRS-Ib (6 codons)   aaRS-IIb (6 codons)  
8 Arg(R) -P 8 Gly(G)-NP
9 Glu(E) -P 9 Ala(A)-NP
10 Gln(Q) -P 10 Phe(F)-NP
aaRS-Ic (10 codons)   aaRS-IIc (10 codons)  
       
* For c=11, a=00, u=01, g=10: aaRS-I = 1012, aaRS-II = 1004. If 2 codons whose numbers differ by 4 are interchanged between the two sides, both sides become 1008. Possibilities: 17-13, 16-12, 19-15, 18-14, 32-28, 34-30, 48-44, 50-46, 37-33, 39-35, 36-32, 38-34, 4-0, 6-2, 53-49, 55-51, 5-1, 7-3. For example, 5-1 = auu-aau.
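The two class totals and the candidate swaps can be recomputed in the same way. The sketch below assumes the code c=11, a=00, u=01, g=10, the 6-bit reading of a codon, and the aaRS-I membership from the table above (Leu, Val, Ile, Cys, Met, Tyr, Trp, STOP, Arg, Glu, Gln, with uga = Trp and aua = Met). The names are illustrative. It lists every aaRS-I codon whose number exceeds that of some aaRS-II codon by exactly 4, for comparison with the possibilities given above.

```python
from itertools import product

BASES = "ucag"
# Assumption: uga = Trp (W) and aua = Met (M), as in this document's tables.
AA = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AA)}

BITS = {"c": 0b11, "a": 0b00, "u": 0b01, "g": 0b10}
AARS_I = set("LVICMYW*REQ")  # STOP (*) is listed with aaRS-Ib in the table


def codon_number(codon):
    """Read the three 2-bit base codes as one 6-bit number."""
    b1, b2, b3 = (BITS[base] for base in codon)
    return (b1 << 4) | (b2 << 2) | b3


number_to_codon = {codon_number(c): c for c in CODE}
class_i = {c for c, aa in CODE.items() if aa in AARS_I}
sum_i = sum(codon_number(c) for c in class_i)
sum_ii = sum(codon_number(c) for c in CODE) - sum_i

# Swapping an aaRS-I codon numbered n with an aaRS-II codon numbered n - 4
# lowers the aaRS-I total by 4 and raises the aaRS-II total by 4.
swaps = []
for codon in sorted(class_i, key=codon_number):
    n = codon_number(codon)
    partner = number_to_codon.get(n - 4)
    if partner is not None and partner not in class_i:
        swaps.append((n, n - 4, codon, partner))

print(sum_i, sum_ii)
for n, m, codon, partner in swaps:
    print(f"{n}-{m}: {codon} ({CODE[codon]}) <-> {partner} ({CODE[partner]})")
```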
           

 

Genetic Code table

    u(1)       c(2)       a(3)       g(4)            
  AA codon no. remark AA codon no. remark AA codon no. remark AA codon no. remark   no.    
u(1) Phe uuu   ser ucu   tyr uau   cys ugu   u    
    uuc   ser ucc   tyr uac   cys ugc   c    
  Leu uua   ser uca   stop uaa   trp/stop uga   a    
    uug   ser ucg   stop uag   trp ugg   g    
                               
c(2) Leu cuu   pro ccu   his cau   arg cgu   u    
  Leu cuc   pro ccc   his cac   arg cgc   c    
  Leu cua   pro cca   gln caa   arg cga   a    
  Leu cug   pro ccg   gln cag   arg cgg   g    
                               
a(3) Ile auu   thr acu   asn aau   ser agu   u    
  Ile auc   thr acc   asn aac   ser agc   c    
  met/Ile aua   thr aca   lys aaa   arg aga   a    
  met aug   thr acg   lys aag   arg agg   g    
                               
g(4) val guu   ala gcu   asp gau   gly ggu   u    
  val guc   ala gcc   asp gac   gly ggc   c    
  val gua   ala gca   glu gaa   gly gga   a    
  val gug   ala gcg   glu gag   gly ggg   g    
u=24, c=8, a=8, g=8   u=8, c=24, a=8, g=8   u=8, c=8, a=24, g=8   u=8, c=8, a=8, g=24
cg/au=1/2             cg/au=2/1             cg/au=1/2             cg/au=2/1
                                         
Base

LOGIC GATES

Logic gates are elementary building blocks of a digital circuit. Most logic gates have 2 inputs & 1 output.
Each of the 2 inputs is in one of the two binary conditions, 0 or 1: 0 ---> OFF ; 1 ---> ON
name of the gate indicative no.
AND 1
OR 2
XOR 3
NAND 4
NOR 5
XNOR 6
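As an illustration of the six gates, the sketch below feeds the two bits of a nucleotide's binary code into each gate. Two things are assumed and not taken from this page: that the two bits of one base are the gate's two inputs, and that the code is A=01, C=10, G=00, U=11 (one of the two codes used in the notes above). `GATES`, `BITS` and `gate_outputs` are illustrative names.

```python
# Each gate maps two input bits (0 = OFF, 1 = ON) to one output bit.
GATES = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "XNOR": lambda a, b: 1 - (a ^ b),
}

# Assumed 2-bit code per base: A=01, C=10, G=00, U=11.
BITS = {"a": (0, 1), "c": (1, 0), "g": (0, 0), "u": (1, 1)}


def gate_outputs(base):
    """Apply every gate to the two bits of one base."""
    first, second = BITS[base]
    return {name: gate(first, second) for name, gate in GATES.items()}


for base in "acgu":
    print(base, BITS[base], gate_outputs(base))
```

Applying NOT to a gate's output gives the output of its complementary gate (AND/NAND, OR/NOR, XOR/XNOR), so a separate "NOT GATE output" column can be read off from the same table.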
   

Genetic database:

http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html

http://www.ncbi.nlm.nih.gov/books/NBK44863/

http://compbio.cs.princeton.edu/ancestralaa

http://blast.ncbi.nlm.nih.gov/Blast.cgi

http://www.genome.jp/aaindex/

http://www.incogen.com/bioinfo_tutorials/Bioinfo-Lecture_3-pairwise-align2.html

http://www.whatabeginning.com/Misc/Genetics/Genetics_VS.htm

http://www.genome.jp/tools/clustalw/

http://www.cytoscape.org/

Here is a DNA sequence that I am going to use for testing the hypothesis. The complete human genome can be found online here -

ftp://ftp.ensembl.org/pub/release-67...o_sapiens/dna/

Here is the NCBI index page for complete genomes - http://www.ncbi.nlm.nih.gov/genome

As you can see, I can choose any of the following - Human, microbes, organelles, plants, viruses.

Here is the complete sequence for Chromosome 1 of human DNA

http://www.ncbi.nlm.nih.gov/nuccore/...0?report=fasta

Chromosome 1 is 249,250,621 bases long.

10 million bytes = 6 million bases approximately, so the complete chromosome will have a download size of about 400 megabytes, which is quite big.
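The size estimate is plain proportion arithmetic and can be worked through as below. The only inputs are the two figures quoted above (10 million bytes for about 6 million bases, and the chromosome length); the variable names are illustrative.

```python
bases_in_chromosome_1 = 249_250_621

# Quoted ratio: 10 million bytes hold approximately 6 million bases.
bytes_per_base = 10_000_000 / 6_000_000

size_bytes = bases_in_chromosome_1 * bytes_per_base
size_mb = size_bytes / 1_000_000

print(f"about {size_mb:.0f} million bytes")
```

This comes to roughly 415 million bytes, in line with the "about 400 megabytes" figure.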

I have searched the first 6.5 million bases of Chromosome 1 for any pattern of amino acids that would match the Genesis1 sequence of letters.

There is no match between the amino acid sequence and the Genesis1 letter sequence within the part of Chromosome 1 searched.

Download

There are two software programs. The first is for non-overlapping codons. The second is for overlapping codons -

A. http://www.craigdemo.co.uk/DNAsetup.zip

B. http://www.craigdemo.co.uk/OverlappingCodonSetup.zip


Extract from zip then in the dialog that appears click NEXT - EVERYONE - NEXT - NEXT.

Note : If you do not have the dotNet 4.0 framework installed on your computer already, you may need to install that also. 
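The difference between reading non-overlapping and overlapping codons can be sketched as follows. This is not the code of the two programs linked above, whose internals are not described here; it only illustrates the two ways of cutting a sequence into triplets. The function names and the `frame` parameter are made up for the example.

```python
def non_overlapping_codons(seq, frame=0):
    """Cut the sequence into consecutive triplets, starting at the given frame."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def overlapping_codons(seq):
    """Slide a window of three bases along the sequence one base at a time."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


sequence = "augcaugg"
print(non_overlapping_codons(sequence))           # frame 0
print(non_overlapping_codons(sequence, frame=1))  # frame 1
print(overlapping_codons(sequence))
```

For a sequence of n bases, the overlapping reading yields n - 2 codons, while a non-overlapping reading yields about n / 3 codons per frame and drops any incomplete triplet at the end.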
http://www.webqc.org/aminoacids.php
 

codon : --     binary : --         * indicates no GATE
  binary code gate output NOT GATE output
Base-1 nucleotide
Base-2 nucleotide
Base-3 nucleotide
Decimal sum    
LHS - add 1 & find sum    
RHS - add 1 & find sum    
Base1*+Base2gate+Base3gate      
Base1gate+Base2*+Base3gate      
Base1gate+Base2gate+Base3*      
Base1*+Base2*+Base3gate      
Base1*+Base2gate+Base3*      
Base1gate+Base2*+Base3*