Recycling Corner/Alpha Helix Generator

From Jmol
Jump to navigation Jump to search

RIBOZOME - a Polypeptide Alpha Helix Generator script

Creates an alpha helix from a user input string.

Jmol script by Ron Mignery with help from Angel Herráez

Script for the Jmol application

The top-level function plico_gen_pp asks user for a peptide sequence in a pop-up.

The top-level function plico_gen_alpha accepts a sequence string as a parameter.

With this script you can generate a polypeptide alpha helix chain from a sequence string in 1-char amino-acid coding. You can optionally give it a chain label other than the default :A or start at a residue number other than the default 1. You can also add to an existing chain, add new chains to an existing model or now create the chain in a new frame by prepending a '+' to the sequence string.

Chains start from the origin and extend along the Z-axis as they are built. If a chain with the same chain label as the new chain is at the origin, the new sequence is added to the old. If a chain with a different chain label is at the origin, all existing chains are shifted 20 angstroms to the right along the X axis until the origin is clear.

Ribozome is a member of the Plico suite of protein folding tools described here. It may be installed and accessed as a macro with the file:

Title=PLICO Generate Polypeptide
Script=script <path to your script folder>/ribozome.spt;plico_gen_pp

saved as ribozome.macro in your .jmol/macros directory as described in Macro.

Copy and paste the following into a text editor and save in your scripts directory as ribozome.spt.

#   RIBOZOME - Jmol script by Ron Mignery
#   v1.19 beta    4/12/2016 -axis is now a reserved word
#
#   RIBOZOME takes a string message encoding an amino acid (aa) sequence
#   and generates a corresponding alpha helix one aa at a time from the
#   N terminus to the C terminus rotating the emerging helix as it goes.
#
#   The message is a string entered by the user at a prompt. <+A:><n...>[A-Z]...
#   It may be typed in or pasted in and be of any length
#   If the message is prepended with "+" then the chain is created in a new
#   frame. Otherwise it is added to the current frame
#   If the message is then prepended with <C>: (where C is any single letter)
#   then the chain is so labeled and separated from existing chains
#   if different from the first chain.
#   If the message is then prepended with digits (-0-9), that number becomes
#   the first resno (even if negative).  Otherwise the first is 1.
#
#   The IUPAC/IUBMB 1 letter code is used:
#   A=ALAnine B=GLutam?X* C=CYSteine D=ASPartate E=GLUtamate
#   F=PHEnylalanine G=GLYcine H=HIStidine I=IsoLEucine K=LYSine
#   L=LEUcine M=METhionine N=ASparagiNe O=PYrroLysine*** P=PROline
#   Q=GLutamiNe R=ARGinine S=SERine T=THReonine U=SElenoCysteine
#   V=VALine W=TRyPtophan X=UNKnown Y=TYRosine Z=ASpar?X**
#     *Either GLU or GLN: drawn as GLN with chi3 flipped
#    **Either ASP or ASN: drawn as ASN with chi3 flipped
#   ***Not supported: drawn as ALA

# The following constant values determine the pitch of the alpha helix
kPHI = -64    # Dihedral angle of N-CA bond (nominally -64)
kPSI = -39    # Dihedral angle of CA-C bond (nominally -39)
kOMEGA = 180    # Dihedral angle of the peptide bond (nominally 180)
kPEPTIDE_ANGLE = 120    # C-N-CA angle (nominally 120)
kPRO_BUMP = -10 # Psi angle change increment to N-3psi when N is Pro
gCHAIN = 'A'    # The chain id
gA = ""
gPdbAddHydrogens = false
gAppendNew = true
gNewFrame = false
gSeq = ""

# Lookup 3 letter code from 1 letter code
g3from1 = {"A":"ALA", "B":"GLX","C":"CYS", "D":"ASP","E":"GLU", "F":"PHE",
    "G":"GLY", "H":"HIS","I":"ILE", "K":"LYS","L":"LEU", "M":"MET",
    "N":"ASN", "O":"PYL","P":"PRO", "Q":"GLN","R":"ARG", "S":"SER",
    "T":"THR", "U":"SEC","V":"VAL", "W":"TRP","X":"UNK", "Y":"TYR", "Z":"ASX"}

# Generate PDB atom record
function gen_atom(e, aa, i, xyz) {
    gA =  format("ATOM  %5d %4s %3s ", gN, e, aa )
    gA +=  format("%s%4d    %8.3f", gCHAIN, i, xyz[1] )
    gA +=  format("%8.3f%8.3f\n", xyz[2], xyz[3] )
    gN++
    return gA
};

# Generate a PDB amino acid record set
function gen_aa(i, aa) {    # Writes globals gA and gN

    # From constructed AAs
    var N0 = [0.0, 0.0, 0.0]
    var CA = [ 0.200, 1.174, 0.911 ]
    var C  = [ -1.110, 1.668, 1.425 ]
    var O  = [ -1.320, 1.693, 2.62 ]
    var CB = [ 1.062, 2.1950, 0.230 ]

    var G1 = [ 2.359, 1.607, -0.344]
    var G2 = [ 1.363, 3.336, 1.157 ]
    var D1 = [ 3.222, 2.656, -1.048 ]
    var D2 = [ 3.143, 0.904, 0.725 ]
    var E1 = [ 3.645, 3.749, -0.167 ]
    var E2 = [ 2.491, 3.234, -2.249 ]
    var Z  = [ 4.474, 4.857, -0.565 ]
    var H1 = [ 4.090, 6.173, -0.166 ]
    var H2 = [ 5.565, 4.707, -1.229 ]

    var D1dn = [ 2.955, 2.229, -1.250 ]
    var D2dn = [ 2.859, 0.552, 0.102 ]

    var E1eq = [ 3.821, 3.528, -0.382 ]
    var E2eq = [ 3.337, 2.634, -2.293 ]

    var Gp = [ 2.008, 1.24, -0.46 ]
    var Dp = [ 1.022, 0.213, -1.031 ]

    var Gfy  = [ 2.368, 1.471, -0.0152 ]
    var D1fy = [ 3.346, 1.524, 0.921 ]
    var D2fy = [ 2.493, 0.516, -1.151 ]
    var E1fy = [ 4.513, 0.615, 0.8244 ]
    var E2fy = [ 3.528, -0.336, -1.206 ]
    var Zfy  = [ 4.588, -0.285, -0.168 ]
    var Hfy = [ 5.738, -1.245, -0.233 ]

    var Ghw  = [ 2.406, 1.626, -0.134 ]
    var D1hw = [3.498, 1.936, 0.675]
    var D2hw = [ 2.713, 0.901, -1.211 ]
    var E1hw = [ 4.160, 0.518, -1.178 ]
    var E2hw = [ 4.622, 1.160, 0.0816 ]
    var E3hw = [ 3.789, 2.523, 1.944 ]
    var Z2hw = [ 5.973, 1.177, 0.689 ]
    var Z3hw = [ 5.014, 2.550, 2.503 ]
    var H2hw = [ 6.153, 1.846, 1.844 ]

    #N1 = [ 2.069, -2.122, -0.554]

    # Build PDB atom records common to all AAs
    var a3 = g3from1[aa]
    if (a3 = "") {
        a3 = "UNK"
    }
    print format("+ %s%d/%d", a3, i, gSeq.count + gResno)
    gA = gen_atom(" N  ", a3, i, N0)
    gA += gen_atom(" CA ", a3, i, CA)
    gA += gen_atom(" C  ", a3, i, C)
    gA += gen_atom(" O  ", a3, i, O)
    if ((aa != 'G') && (aa != 'X')) {
        gA += gen_atom(" CB ", a3, i, CB)
    }

    # Now add AA specific atom records
    switch (aa) {
    case 'A' :
        break;
    case 'B' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD ", a3, i, D1)
        gA += gen_atom(" OE1", a3, i, E2eq)    # GLN with Es switched
        gA += gen_atom(" NE2", a3, i, E1eq)
        break;
    case 'C' :
        gA += gen_atom(" SG ", a3, i, G2)
        break;
    case 'D' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" OD1", a3, i, D1dn)
        gA += gen_atom(" OD2", a3, i, D2dn)
        break;
    case 'E' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD ", a3, i, D1)
        gA += gen_atom(" OE1", a3, i, E1eq)
        gA += gen_atom(" OE2", a3, i, E2eq)
        break;
    case 'F' :
        gA += gen_atom(" CG ", a3, i, Gfy)
        gA += gen_atom(" CD1", a3, i, D1fy)
        gA += gen_atom(" CD2", a3, i, D2fy)
        gA += gen_atom(" CE1", a3, i, E1fy)
        gA += gen_atom(" CE2", a3, i, E2fy)
        gA += gen_atom(" CZ ", a3, i, Zfy)
        break;
    case 'G' :
        break;
    case 'H' :
        gA += gen_atom(" CG ", a3, i, Ghw)
        gA += gen_atom(" ND1", a3, i, D1hw)
        gA += gen_atom(" CD2", a3, i, D2hw)
        gA += gen_atom(" CE1", a3, i, E2hw)
        gA += gen_atom(" NE2", a3, i, E1hw)
        break;
    case 'I' :
        gA += gen_atom(" CG1", a3, i, G1)
        gA += gen_atom(" CG2", a3, i, G2)
        gA += gen_atom(" CD1", a3, i, D1)
        break;
    case 'K' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD ", a3, i, D1)
        gA += gen_atom(" CE ", a3, i, E1)
        gA += gen_atom(" NZ ", a3, i, Z)
        break;
    case 'L' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD1", a3, i, D1)
        gA += gen_atom(" CD2", a3, i, D2)
        break;
    case 'M' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" SD ", a3, i, D1)
        gA += gen_atom(" CE ", a3, i, E1)
        break;
    case 'N' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" OD1", a3, i, D1dn)
        gA += gen_atom(" ND2", a3, i, D2dn)
        break;
    case 'P' :
        gA += gen_atom(" CG ", a3, i, GP)
        gA += gen_atom(" CD ", a3, i, DP)
        break;
    case 'Q' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD ", a3, i, D1)
        gA += gen_atom(" OE1", a3, i, E1eq)
        gA += gen_atom(" NE2", a3, i, E2eq)
        break;
    case 'R' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" CD ", a3, i, D1)
        gA += gen_atom(" NE ", a3, i, E1)
        gA += gen_atom(" CZ ", a3, i, Z)
        gA += gen_atom(" NH1", a3, i, H1)
        gA += gen_atom(" NH2", a3, i, H2)
        break;
    case 'S' :
        gA += gen_atom(" OG ", a3, i, G1)
        break;
    case 'T' :
        gA += gen_atom(" OG1", a3, i, G1)
        gA += gen_atom(" CG2", a3, i, G2)
        break;
    case 'U' :
        gA += gen_atom("SeG ", a3, i, G1)
        break;
    case 'V' :
        gA += gen_atom(" CG1", a3, i, G1)
        gA += gen_atom(" CG2", a3, i, G2)
        break;
    case 'W' :
        gA += gen_atom(" CG ", a3, i, Ghw)
        gA += gen_atom(" CD2", a3, i, D1hw)
        gA += gen_atom(" CD1", a3, i, D2hw)
        gA += gen_atom(" CE2", a3, i, E2hw)
        gA += gen_atom(" NE1", a3, i, E1hw)
        gA += gen_atom(" CE3", a3, i, E3hw)
        gA += gen_atom(" CZ2", a3, i, Z2hw)
        gA += gen_atom(" CZ3", a3, i, Z3hw)
        gA += gen_atom(" CH2", a3, i, H2hw)
        break;
    case 'X' :
        gA += gen_atom(" Xx ", a3, i, CB)
        break;
    case 'Y' :
        gA += gen_atom(" CG ", a3, i, Gfy)
        gA += gen_atom(" CD1", a3, i, D1fy)
        gA += gen_atom(" CD2", a3, i, D2fy)
        gA += gen_atom(" CE1", a3, i, E1fy)
        gA += gen_atom(" CE2", a3, i, E2fy)
        gA += gen_atom(" CZ ", a3, i, Zfy)
        gA += gen_atom(" OH ", a3, i, Hfy)
        break;
    case 'Z' :
        gA += gen_atom(" CG ", a3, i, G1)
        gA += gen_atom(" OD1", a3, i, D2dn)    # ASN with Ds switched
        gA += gen_atom(" ND2", a3, i, D1dn)
        break;
    default :
        break;
    }

    return gA
};

# Generate an alpha helix
function plico_gen_alpha(seq) {
    try {
        gPdbAddHydrogens = pdbAddHydrogens
        set pdbAddHydrogens false
        if (gPlicoRecord != "") {
            var g = format("show file \"%s\"", gPlicoRecord)
            var ls = script(g)
            if (ls.find("FileNotFoundException")) {
                ls = ""
            }
            ls += format("plico_gen_alpha(\"%s\");", seq)
            write var ls @gPlicoRecord
        }
    
        seq = seq%9999%0
        gNewFrame = false
        if (seq[1] == '+') {
            gNewFrame = true
            seq = seq[2][9999]
        }
        if (seq[2] == ':') {
            gCHAIN = seq[1]
            seq = seq[3][9999]
        }
        var resStart = ""
        while ("_0123456789-".find(seq[1]) >  0) {
            resStart += ""+seq[1]
            seq = seq[2][9999]
        }
        gSeq = seq
    
        var c = gCHAIN
        if (resStart.size > 0) {
            gResno = -1 + resStart    # global pre-existing AA count
        }
        else {
            gResno = 0    # global pre-existing AA count
        }
        var pn = 1    # previous gN
        gAppendNew = appendNew
        gN = 1
        if (gNewFrame) {
            appendNew = true
        }
        else {
            appendNew = false
    
            # If not new
            if (m > 0) {
    
                # Get the largest atomno in frame
                gN = {thisModel}.atomno.max
    
                # While there is an atom at the origin
                while (true) {
                    var ai = {within(0.1, {0 0 0}) and thisModel}
                    if (ai) {
    
                        # If on the same chain
                        if (ai[1].chain = c) {
    
                            # Delete OXT
                            delete {(atomName="OXT") and thisModel
                                and (chain=c)}
                            gResno = ai.resno
                            pn = ai.atomno
                            break
                        }
                        # Else move all from the new chain rightward on X axis
                        else {
                            select {thisModel and (chain!=c)}
                            translateselected {20, 0, 0 }
                            gN++
                        }
                    }
                    else {
                        break
                    }
                } # endwhile
            }
        }
    
        # For each aa
        var nn = gN    # new N
        for (var i = 1; i <= gSeq.count; i++) {
    
            if ((m > 0) and not appendNew) {
    
                # Move polypeptide C to bond distance from new AA N
                select {thisModel and (chain=c)}
                fix none
                translateselected {2.069, -2.122, -0.554 } #N1
            }
    
            # Gen AA
            gA = "data \"append aa\"\n" # global PDB atom record
            gA += gen_aa(i + gResno, gSeq[i]) # gN is updated in subroutine
            gA += "end \"append aa\""
            script inline @{gA}
    
            var f = (_frameID/1000000)
            var m = (_frameID%1000000)
            appendNew = false
    
            # If PRO ahead
            var pb = 0
            if ((gSeq.count - i) >= 2) {
                if (gSeq[i + 2] == 'P') {
                    pb = kPRO_BUMP
                }
            }
    
            # If not first AA
            if (nn > 1) {
    
                # Gen axis on new N perpendicular to the plane
                # containing atoms nn, pn+2 and nn+1
                var aNn = {(atomno=nn) and thisModel and (chain=c)}
                var aNn1 = {(atomno=@{nn+1}) and thisModel and (chain=c)}
                var aPn1 = {(atomno=@{pn+1}) and thisModel and (chain=c)}
                var aPn2 = {(atomno=@{pn+2}) and thisModel and (chain=c)}
                var aPn3 = {(atomno=@{pn+3}) and thisModel and (chain=c)}
                var v1=aPn2.xyz - aNn.xyz
                var v2=aNn1.xyz - aNn.xyz
                var caxis = cross(v1, v2)
    
                # Center on atom previous C
                caxis += aPn2.xyz
    
                # Rotate the polypeptide N on the new AA C to tetrahedral (nominally 110)
                select {(atomno < nn) and thisModel and (chain=c)}
                fix {(atomno >= nn) and thisModel and (chain=c)}
                rotateselected @caxis @aNn @{kPEPTIDE_ANGLE - 65.5}
    
                # Make omega dihedral = kOMEGA (nominally 180)
                rotateselected @aPn2 @aNn @{kOMEGA - 157.8}
    
                # Make the new phi dihedral = kPHI (nominally -60)
                rotateselected @aNn @aNn1 @{-kPHI - 116.5}
    
                # Make the old psi dihedral = kPSI (nominally -50)
                select {(atomno < nn) and thisModel and (chain=c)
                    and not aPn2 and not aPn3}
                rotateselected @aPn1 @aPn2 @{-kPSI - 60.6 + pb}
    
                # Make the peptide bond
                connect @aNn @aPn2
                refresh
            }
    
            # Step new and previous N
            pn = nn
            nn = gN
    
        } # endfor i
    
        # Add terminal O
        gA = "data \"append aa\"\n"    # global PDB atom record
        gA += gen_atom(" OXT", g3from1[gSeq[gResno+gSeq.count]],
            gResno+gSeq.count, [ -2.142, 2.057, 0.574])
        gA += "end \"append aa\""
        script inline @{gA}
        refresh
    
        connect
        var xx = {(element="Xx") and thisModel and (chain=c)}
        for (var i = 1; i <= xx.size; i++) {
              connect 1.8 {(atomindex=@{xx[i].atomIndex}) and thisModel and (chain=c)}
        }
    
        # Clean up
        connect ([UNK].CA) ([UNK].Xx and within(group, _1) and thisModel and (chain=c))
        select (thisModel)
        fix none
        print format("%d atoms generated", gN-1)
        appendNew = gAppendNew
        pdbAddHydrogens = gPdbAddHydrogens
    }
    catch {
        print "error"
    }
}

function plico_gen_pp {
    print "Generating Alpha Helix..."

    # Get the sequence from the user
    gSeq = prompt("Enter AA sequence (<+A:><n...>[A-Z]...)", "")%9999%0
    if ((gSeq != "NULL") and (gSeq.count > 0)) {
        print format ("Sequence=%s  phi=%d  psi=%d", gSeq, kPHI, kPSI)
        plico_gen_alpha(gSeq)
    }
}
# end of ribozome.spt

In a webpage

(Updated to RIBOZOME v1.16 script, now works both in Jmol-Java and JSmol-HTML5)

Sequence input is managed from the webpage. In addition, the helix parameters (which are fixed in the original script) may be changed in the page by a user. The model is generated and displayed in a Jmol object within the page (JmolApplet or JSmol HTML5 object).

See it in action

Contributors

Remig, AngelHerraez