How do I make a Sequence from a String or make a Sequence Object back into a String?

A lot of the time we see sequence represented as a String of characters eg "atgccgtggcatcgaggcatatagc". It's a convenient method for viewing and succinctly representing a more complex biological polymer. BioJava makes use of SymbolLists and Sequences to represent these biological polyners as Objects. Sequences extend SymbolLists and provide extra methods to store things like the name of the sequence and any features it might have but you can think of a Sequence as a SymbolList.
Within Sequence and SymbolList the polymer is not stored as a String. BioJava differentiates different polymer residues using Symbol objects that come from different Alphabets. In this way it is easy to tell if a sequence is DNA or RNA or something else and the 'A' symbol from DNA is not equal to the 'A' symbol from RNA. The details of Symbols, SymbolLists and Alphabets are covered here. The crucial part is there needs to be a way for a programmer to convert between the easily readable String and the BioJava Object and the reverse. To do this BioJava has Tokenizers that can read a String of text and parse it into a BioJava Sequence or SymbolList object. In the case of DNA, RNA and Protein you can do this with a single method call. The call is made to a static method from either DNATools, RNATools or ProteinTools.

String to SymbolList
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;

public class StringToSymbolList {
public static void main(String[] args) {

try {
//create a DNA SymbolList from a String
SymbolList dna = DNATools.createDNA("atcggtcggctta");

//create a RNA SymbolList from a String
SymbolList rna = RNATools.createRNA("auugccuacauaggc");

//create a Protein SymbolList from a String
SymbolList aa = ProteinTools.createProtein("AGFAVENDSA");
}
catch (IllegalSymbolException ex) {
//this will happen if you use a character in one of your strings that is
//not an accepted IUB Character for that Symbol.
ex.printStackTrace();
}

}
}

String to Sequence
import org.biojava.bio.seq.*;
import org.biojava.bio.symbol.*;

public class StringToSequence {
public static void main(String[] args) {

try {
//create a DNA sequence with the name dna_1
Sequence dna = DNATools.createDNASequence("atgctg", "dna_1");

//create an RNA sequence with the name rna_1
Sequence rna = RNATools.createRNASequence("augcug", "rna_1");

//create a Protein sequence with the name prot_1
Sequence prot = ProteinTools.createProteinSequence("AFHS", "prot_1");
}
catch (IllegalSymbolException ex) {
//an exception is thrown if you use a non IUB symbol
ex.printStackTrace();
}
}
}

SymbolList to String
You can call the seqString() method on either a SymbolList or a Sequence to get it's Stringified version.
import org.biojava.bio.symbol.*;

public class SymbolListToString {
public static void main(String[] args) {
SymbolList sl = null;
//code here to instantiate sl

//convert sl into a String
String s = sl.seqString();
}
}

Share/Bookmark

No comments:

Post a Comment


Powered by  MyPagerank.Net

LinkWithin