Under the guidance of Jim, I created an Xfam class in the jalview.ws package, which the Rfam and Pfam classes extend. (Side note: there is an Xfam blog about new developments in the Rfam and Pfam databases.) Then I added RfamSeed and RfamFull classes, similar to the ones for Pfam. These contain the methods to fetch sequences from the "seed" and "full" alignments, respectively, which are available on Rfam in Stockholm format. I had to rename some methods to keep things consistent. In SequenceFetcher.java (in package jalview.ws), I called addDBRefSourceImpl() for RfamSeed and RfamFull so that Rfam sequence retrieval can be called from the Jalview menu.
To fetch sequences from Rfam (and Pfam), Jalview accesses stable URLs that return the alignments. In the RfamSeed and RfamFull classes, part of the URL is hardcoded, together with the correct variables for the query string. (I learned this while I was creating the classes; see the Wikipedia page on CGI and QUERY_STRING.)
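To make the class layout a bit more concrete, here is a rough sketch of how such a hierarchy could look. This is not the actual Jalview code: every class and method name below (XfamSketch, databaseName(), alignmentType(), SourceRegistrySketch and so on) is made up for illustration, and the small registry only imitates the idea of registering each source so that it can show up in a menu.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a shared base class; not the real jalview.ws.Xfam.
abstract class XfamSketch {
    abstract String databaseName();   // "Rfam" or "Pfam"
    abstract String alignmentType();  // "seed" or "full"

    // Shared behaviour lives in the base class so subclasses stay tiny.
    String sourceLabel() {
        return databaseName() + " (" + alignmentType() + ")";
    }
}

// Database-specific layer: fixes the database name once.
abstract class RfamSketch extends XfamSketch {
    @Override
    String databaseName() {
        return "Rfam";
    }
}

// Leaf classes only say which alignment they fetch.
class RfamSeedSketch extends RfamSketch {
    @Override
    String alignmentType() {
        return "seed";
    }
}

class RfamFullSketch extends RfamSketch {
    @Override
    String alignmentType() {
        return "full";
    }
}

// Hypothetical registry, loosely imitating the step of registering each source.
class SourceRegistrySketch {
    private final List<XfamSketch> sources = new ArrayList<>();

    void register(XfamSketch source) {
        sources.add(source);
    }

    List<String> labels() {
        List<String> out = new ArrayList<>();
        for (XfamSketch s : sources) {
            out.add(s.sourceLabel());
        }
        return out;
    }

    public static void main(String[] args) {
        SourceRegistrySketch registry = new SourceRegistrySketch();
        registry.register(new RfamSeedSketch());
        registry.register(new RfamFullSketch());
        for (String label : registry.labels()) {
            System.out.println(label);
        }
    }
}
```

The point of the layering is that anything common to Rfam and Pfam sits in one place, while the seed and full classes differ only in the alignment type they ask for.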
The variables for the Rfam website:
- 'acc': followed by "=" and the accession number will give you the corresponding family.
- 'id': followed by "=" and the ID name will give you the corresponding family.
- 'alnType': alignment type, can be 'seed' or 'full'
- 'nseLabels': toggle for species names, can be 0 or 1
- 'format': file format, can be 'stockholm', 'pfam', 'fasta' or 'fastau'.
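As an illustration of how these variables combine into a query string, here is a minimal sketch in Java. It is not the code from the RfamSeed or RfamFull classes: the class name, the method name and especially BASE_URL are assumptions (the base URL is a placeholder, not the real Rfam address), and RF00001 is only used as an example accession.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class RfamQuerySketch {
    // Placeholder (assumption): replace with the real stable Rfam URL.
    static final String BASE_URL = "http://rfam.example.org/download";

    // Builds BASE_URL?acc=...&alnType=...&nseLabels=...&format=...
    static String buildUrl(String accession, String alnType, int nseLabels, String format) {
        if (!alnType.equals("seed") && !alnType.equals("full")) {
            throw new IllegalArgumentException("alnType must be 'seed' or 'full'");
        }
        if (nseLabels != 0 && nseLabels != 1) {
            throw new IllegalArgumentException("nseLabels must be 0 or 1");
        }

        // LinkedHashMap keeps the parameters in insertion order.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("acc", accession);
        params.put("alnType", alnType);
        params.put("nseLabels", Integer.toString(nseLabels));
        params.put("format", format);

        StringBuilder url = new StringBuilder(BASE_URL).append('?');
        boolean first = true;
        for (Map.Entry<String, String> p : params.entrySet()) {
            if (!first) {
                url.append('&');
            }
            url.append(p.getKey()).append('=').append(encode(p.getValue()));
            first = false;
        }
        return url.toString();
    }

    // URL-encodes a single parameter value.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("RF00001", "seed", 0, "stockholm"));
    }
}
```

To look a family up by name instead of accession number, the 'acc' entry would be swapped for 'id' in the same way.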
Rfam has two mirrors, one at the Sanger Institute and one at Janelia Farm. The Janelia Farm mirror has not yet been updated to Rfam 10.0, so I used the Sanger Institute URL.
There's one bug, and I think it might be in the Stockholm parser: if you go to View > Alignment properties..., no dialog window pops up with the alignment properties.