Thursday, June 3, 2010

How RALEE parses the secondary structure line

I've figured out how RALEE interprets the secondary structure line and works out the helices. I really don't like Lisp, haha, but the GNU Emacs Lisp Reference Manual isn't too bad: http://www.gnu.org/software/emacs/manual/html_mono/elisp.html

Basically, this is what happens, in two separate functions.

Parse the secondary structure line from the Stockholm file (ralee-get-base-pairs)

Description: Keep a list of the positions of the open brackets and a list whose elements are paired positions. When you reach a close bracket, pair it with the last open bracket and store this pair in the "pairs" list.

  • Split the line into a "position" list (each character of the line is an element of the list)

  • Keep a "stack" list and a "pairs" list

  • Go through each element of the "position" list
    *if the base is an open bracket type, add the current position as the first element of the "stack" list

    *if the base is a close bracket type, create a 2-element list consisting of the first element of the "stack" list (the most recent unpaired open bracket) and the current position. Add this list as the first element of the "pairs" list and remove the first element from the "stack" list.

Numbers indicate the order in which the pairs are added to the "pairs" list

<<..<<..>>..<<..>>>>
65..21..12..43..3456
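
The steps above can be sketched in Python roughly as follows. This is my own illustration, not RALEE's Lisp: the function name `get_base_pairs`, the bracket sets, the 0-based positions and the error on an unmatched close bracket are all assumptions. The stack here grows at the end of a Python list rather than at the front, which behaves the same way.

```python
OPEN_BRACKETS = "<([{"   # assumed set of "open bracket type" characters
CLOSE_BRACKETS = ">)]}"  # assumed set of "close bracket type" characters


def get_base_pairs(structure_line):
    """Return [open, close] pairs, most recently completed pair first."""
    stack = []  # positions of open brackets still waiting for a partner
    pairs = []  # completed pairs; each new pair goes to the front
    for position, char in enumerate(structure_line):
        if char in OPEN_BRACKETS:
            stack.append(position)
        elif char in CLOSE_BRACKETS:
            if not stack:
                # Not described in the post; added so the sketch fails loudly.
                raise ValueError(f"unmatched close bracket at {position}")
            # Pair this close bracket with the last open bracket seen.
            pairs.insert(0, [stack.pop(), position])
    return pairs


pairs = get_base_pairs("<<..<<..>>..<<..>>>>")
print(pairs)
```

For the example line, the pair added first is [5, 8] (the one labelled 1 above), so it ends up last in the returned list, and [0, 19] (labelled 6) ends up first.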

Create a Hash of positions and which helix they belong to (ralee-helix-map)
Description: This function takes the pairs list generated by the above function as its input. It goes through the pairs and figures out how many helices there are and creates a hash where the keys are positions and the values are which helix they belong to.

Variables:
helix = the current helix, an int
helices = hash of the positions and which helix they belong to
lastclose = position of the last close bracket reviewed
lastopen = position of the last open bracket reviewed
open = position of current open bracket
close = position of current close bracket
pair = current pair (consists of open and close)
pairs = list of base pairs (input, elements are 2 element lists)
i = index of the current pair (in pairs list)

  • Go through each item in pairs list

    *for the current pair, open is the first element and close is the second element
    *Check for an inner helix:
     catch things like
    ; <<..>>..<<..>>
    ; *
    if lastclose comes before open, the previous helix has already closed and a new one is starting, so increment the current helix
    *Check for bulges:
     catch things like
    ; <<..<<..>>..<<..>>>>
    ; *

    <<..<<..>>..<<..>>>>
    cc..aa..aa..bb..bbcc

    compare the current pair to all other pairs (this works because of the way that the pairs are stored)
    **if the open of another pair comes before lastopen and also after the current open
    **then find which helix that open belongs to. If it belongs to the current helix, do nothing. Otherwise increment the number of helices.

  • add the current open and close to the helices hash with the helix as the value

  • set lastopen and lastclose to open and close
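
Here is a rough Python sketch of the helix-numbering steps, again my own illustration rather than RALEE's code. It makes a few assumptions: pairs are visited in the order they were added (innermost first, i.e. from the end of the newest-first list), helices are numbered from 1, and the helix counter goes up at most once per pair in the bulge check. The first assumption is the only order I could find that reproduces the cc..aa..aa..bb..bbcc labelling above.

```python
def helix_map(pairs):
    """Map each paired position to a helix number.

    `pairs` is a list of [open, close] pairs with the most recently
    added pair first, as described for the parsing step.
    """
    helices = {}      # position -> helix number
    helix = 1         # assumed starting label
    lastopen = None   # open position of the previously visited pair
    lastclose = None  # close position of the previously visited pair

    # Assumption: walk the pairs in the order they were added.
    for open_, close in reversed(pairs):
        if lastclose is not None:
            # Inner helix check: the previous helix closed before this
            # pair opens, e.g. <<..>>..<<..>>
            if lastclose < open_:
                helix += 1
            # Bulge check: look for an already numbered pair that opens
            # between the current open and lastopen.
            for other_open, _ in pairs:
                if open_ < other_open < lastopen:
                    if helices.get(other_open) != helix:
                        helix += 1
                        break  # assumption: one new helix per pair at most
        helices[open_] = helix
        helices[close] = helix
        lastopen, lastclose = open_, close
    return helices


example = [[0, 19], [1, 18], [12, 17], [13, 16], [4, 9], [5, 8]]
print(helix_map(example))
```

With the pairs from `<<..<<..>>..<<..>>>>`, positions 4, 5, 8 and 9 get helix 1, positions 12, 13, 16 and 17 get helix 2, and positions 0, 1, 18 and 19 get helix 3, which corresponds to the a, b and c labels in the diagram.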
