MDA
MDAT::SequenceSetBase< SequenceType, MemoryType > Class Template Reference

The base class for a SequenceSet.

#include <SequenceSet_basic.hpp>

Public Types

typedef SequenceType value_type
 The sequence type used in the object.
 

Public Member Functions

Constructors & Destructors
 SequenceSetBase ()
 Constructor.
 
 SequenceSetBase (size_t value)
 
virtual ~SequenceSetBase ()
 Destructor.
 
Basic methods
const SequenceType * seq (unsigned int index) const
 Returns a sequence.
 
size_t n_seqs () const
 Returns the number of sequences.
 
size_t size () const
 Returns the number of sequences.
 
size_t length () const
 Returns the length of the first sequence, or 0 if the set is empty.
 
double avg_size () const
 The average size of the sequence set.
 
bool empty () const
 Returns true if no sequences are contained in this object.
 
std::string file () const
 Returns the file the sequences were read from.
 
char seq_type () const throw ()
 Returns the sequence type.
 
void seq_type (char seq_type_) throw ()
 Sets the sequence type.
 
int id () const throw ()
 Returns the id of the set.
 
void id (int val)
 Sets the id of the sequence set.
 
void clear ()
 Sets everything to 0.
 
Input & Output
virtual void read (const std::string &seq_f, const std::vector< std::string > &seq_names, bool check=false, short format=-1)
 Extracts a subalignment from the sequence set.
 
virtual void read (const std::string &seq_f, bool check=false, short format=-1)
 Reads a set of sequences.
 
virtual void write (const std::string &seq_f, const std::string format) const
 Writes the sequences into a file.
 
void add_seq (SequenceType *seq)
 Appends a sequence to a set.
 
Manipulation methods
void to_upper ()
 Turns all characters to uppercase.
 
void to_lower ()
 Turns all characters to lowercase.
 
virtual void delete_seqs (const std::map< std::string, bool > &names)
 Deletes sequences from the alignment.
 
virtual void delete_seqs (std::vector< size_t > &indices)
 Deletes sequences from the alignment.
 
virtual void keep_seqs (std::vector< size_t > &indices)
 Deletes sequences if they are not in the given list.
 
void share (const SequenceSetBase< SequenceType, MemoryType > &set, size_t id)
 Shares a sequence between two sets.
 
void transfer (SequenceSetBase< SequenceType, MemoryType > &set, size_t id)
 Transfers a sequence from one set to another.
 
void transfer (SequenceSetBase< SequenceType, MemoryType > &set)
 Transfers all sequences from one set to another.
 
void sort (std::string type)
 Sorts the sequences.
 
void insert_gaps (const std::string &edit_string)
 Inserts gaps into each sequence.
 

Operators

SequenceType & operator[] (unsigned int index)
 Accesses a sequence by index.
 
const SequenceType & operator[] (unsigned int index) const
 
SequenceType & operator[] (const std::string &seq_name)
 Accesses a sequence by name.
 
const SequenceType & operator[] (const std::string &seq_name) const
 
template<typename SeqType , typename MemType >
std::ostream & operator<< (std::ostream &out, const SequenceSetBase< SeqType, MemType > &seqSet)
 

Detailed Description

template<typename SequenceType, typename MemoryType>
class MDAT::SequenceSetBase< SequenceType, MemoryType >

The base class for a SequenceSet.

Template Parameters
SequenceType  The type of sequence to be stored inside (Sequence, ProteinSequence or DNASequence).
MemoryType  The memory mode to use for the sequence set (Default or MemSafe).

Member Function Documentation

template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::add_seq ( SequenceType *  seq)
inline

Append a sequence to a set.

Parameters
seq  A pointer to the new sequence.
template<typename SequenceType, typename MemoryType>
double MDAT::SequenceSetBase< SequenceType, MemoryType >::avg_size ( ) const

The average size of the sequence set.

Returns
The average size.
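
The documentation does not state how the average is computed. The following is a minimal sketch, assuming it is the arithmetic mean of the sequence lengths and that an empty set yields 0. The free function `avg_size` and the use of `std::string` as the sequence type are illustrative stand-ins, not the MDAT implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in: sequences are plain strings here, not MDAT types.
// Assumption: the average is the arithmetic mean of the sequence lengths,
// and an empty set has an average size of 0.
double avg_size(const std::vector<std::string>& seqs) {
    if (seqs.empty()) {
        return 0.0;
    }
    std::size_t total = 0;
    for (const std::string& s : seqs) {
        total += s.size();
    }
    return static_cast<double>(total) / static_cast<double>(seqs.size());
}
```

The explicit check for an empty set avoids a division by zero.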
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::delete_seqs ( const std::map< std::string, bool > &  names)
virtual

Deletes sequences from the alignment.

Parameters
names  The names of the sequences to delete.
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::delete_seqs ( std::vector< size_t > &  indices)
virtual

Deletes sequences from the alignment.

Parameters
indices  The indices of the sequences to delete.
template<typename SequenceType, typename MemoryType>
int MDAT::SequenceSetBase< SequenceType, MemoryType >::id ( ) const throw ()
inline

Returns the id of the set.

Returns
The id
template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::id ( int  val)
inline

Sets the id of the sequence set.

Parameters
val  The id.
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::insert_gaps ( const std::string &  edit_string)

Inserts gaps into each sequence.

Parameters
edit_string  The pattern of matches and gaps, in reverse order.
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::keep_seqs ( std::vector< size_t > &  indices)
virtual

Deletes sequences if they are not in the given list.

Parameters
indices  The indices of the sequences to keep.
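
Going by the descriptions, keep_seqs is the complement of the index-based delete_seqs: one removes the listed indices, the other removes everything else. The sketch below illustrates that relationship on a plain vector of strings. The helper `filter_by_index` is hypothetical and not part of MDAT; ignoring out-of-range indices and preserving the original order are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of MDAT. Returns the elements of `seqs`
// whose index is listed in `indices` (keep == true) or is not listed
// (keep == false). Original order is preserved; out-of-range indices
// are ignored (an assumption).
std::vector<std::string> filter_by_index(const std::vector<std::string>& seqs,
                                         const std::vector<std::size_t>& indices,
                                         bool keep) {
    // Mark every index that appears in the list.
    std::vector<bool> listed(seqs.size(), false);
    for (std::size_t i : indices) {
        if (i < seqs.size()) {
            listed[i] = true;
        }
    }
    // Copy the sequences that match the requested mode.
    std::vector<std::string> result;
    for (std::size_t i = 0; i < seqs.size(); ++i) {
        if (listed[i] == keep) {
            result.push_back(seqs[i]);
        }
    }
    return result;
}
```

With `keep == true` the function behaves like the description of keep_seqs, with `keep == false` like the index-based delete_seqs.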
template<typename SequenceType, typename MemoryType>
size_t MDAT::SequenceSetBase< SequenceType, MemoryType >::length ( ) const
inline

Returns the length of the sequence inside.

Returns the length of the first sequence, or 0 if the set is empty.

Returns
The length
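
A minimal sketch of the documented behaviour, using plain strings instead of MDAT sequence types. The function name `alignment_length` is hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for length(): the length of the first sequence,
// or 0 when the set holds no sequences.
std::size_t alignment_length(const std::vector<std::string>& seqs) {
    return seqs.empty() ? 0 : seqs.front().size();
}
```

Because only the first sequence is inspected, the result describes the whole set only when all sequences have the same length, as in an alignment.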
template<typename SequenceType, typename MemoryType>
size_t MDAT::SequenceSetBase< SequenceType, MemoryType >::n_seqs ( ) const
inline

Returns the number of sequences.

Returns
The number of sequences.
template<typename SequenceType, typename MemoryType>
SequenceType& MDAT::SequenceSetBase< SequenceType, MemoryType >::operator[] ( unsigned int  index)
inline

Operator to access the sequence.

Parameters
index  The position of the sequence to return.
Returns
Reference to the sequence.
template<typename SequenceType, typename MemoryType>
const SequenceType& MDAT::SequenceSetBase< SequenceType, MemoryType >::operator[] ( unsigned int  index) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

template<typename SequenceType, typename MemoryType>
SequenceType& MDAT::SequenceSetBase< SequenceType, MemoryType >::operator[] ( const std::string &  seq_name)
inline

Accesses a sequence by name.

Parameters
seq_name  The name of the sequence.
Returns
Reference to the sequence.
template<typename SequenceType, typename MemoryType>
const SequenceType& MDAT::SequenceSetBase< SequenceType, MemoryType >::operator[] ( const std::string &  seq_name) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::read ( const std::string &  seq_f,
const std::vector< std::string > &  seq_names,
bool  check = false,
short  format = -1 
)
virtual

Extracts a subalignment from the sequence set.

Only sequences which are denoted in seq_names are extracted. Names in seq_names not occurring in the alignment are ignored. Columns consisting of gaps only are removed.

Parameters
seq_f  The file of the sequences to read.
seq_names  The names of the sequences to read.
format  The format of the alignment (-1 enables automatic format detection).
check  If true, checks that each sequence is a proper biological sequence.
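
The extraction rules described above (keep only the named sequences, ignore unknown names, drop gap-only columns) can be sketched on in-memory data as follows. This is not the MDAT reader: the `NamedSeq` type and `extract_subalignment` function are hypothetical, and the sketch assumes that '-' is the gap character and that all rows have equal length.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record type: (name, aligned residues).
using NamedSeq = std::pair<std::string, std::string>;

// Keeps only the sequences whose name is in `wanted`, then drops every
// column in which all kept sequences have a gap.
// Assumptions: '-' is the gap character and all rows have equal length.
std::vector<NamedSeq> extract_subalignment(const std::vector<NamedSeq>& aln,
                                           const std::vector<std::string>& wanted) {
    const std::set<std::string> names(wanted.begin(), wanted.end());

    // Select rows by name; names that do not occur are simply never matched.
    std::vector<const NamedSeq*> selected;
    for (const NamedSeq& rec : aln) {
        if (names.count(rec.first) > 0) {
            selected.push_back(&rec);
        }
    }

    std::vector<NamedSeq> result;
    for (const NamedSeq* rec : selected) {
        result.emplace_back(rec->first, std::string());
    }
    if (selected.empty()) {
        return result;
    }

    // Copy each column unless it consists of gaps only.
    const std::size_t n_cols = selected.front()->second.size();
    for (std::size_t col = 0; col < n_cols; ++col) {
        bool all_gaps = true;
        for (const NamedSeq* rec : selected) {
            if (rec->second[col] != '-') {
                all_gaps = false;
                break;
            }
        }
        if (all_gaps) {
            continue;
        }
        for (std::size_t row = 0; row < selected.size(); ++row) {
            result[row].second.push_back(selected[row]->second[col]);
        }
    }
    return result;
}
```

Gap-only columns are detected after the selection step, because a column that contains residues in the full alignment can become gap-only once sequences are removed.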
template<typename SequenceType, typename MemoryType>
virtual void MDAT::SequenceSetBase< SequenceType, MemoryType >::read ( const std::string &  seq_f,
bool  check = false,
short  format = -1 
)
inline virtual

Reads a set of sequences.

This function can read unaligned sequences in FASTA format as well as aligned sequences in several formats.

Parameters
seq_f  The file with the sequences to read.
format  The format of the alignment (-1 enables automatic format detection).
check  If true, checks that each sequence is a proper biological sequence.
template<typename SequenceType, typename MemoryType>
const SequenceType* MDAT::SequenceSetBase< SequenceType, MemoryType >::seq ( unsigned int  index) const
inline

Returns a sequence.

Parameters
index  Index of the sequence.
Returns
Const pointer to the sequence.
template<typename SequenceType, typename MemoryType>
char MDAT::SequenceSetBase< SequenceType, MemoryType >::seq_type ( ) const throw ()
inline

Returns the sequence type.

Returns
The sequence type.
template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::seq_type ( char  seq_type_) throw ()
inline

Sets the sequence type.

Parameters
seq_type_  The sequence type.
template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::share ( const SequenceSetBase< SequenceType, MemoryType > &  set,
size_t  id 
)
inline

Shares a sequence between two sets.

Parameters
set  The set to take the sequence from.
id  The index of the sequence.
template<typename SequenceType, typename MemoryType>
size_t MDAT::SequenceSetBase< SequenceType, MemoryType >::size ( ) const
inline

Returns the number of sequences.

Returns
The number of sequences
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::sort ( std::string  type)

Sorts the sequences.

Parameters
type  The sort key: "input" sorts the sequences by input order, "name" sorts by sequence name, and "seq" sorts alphabetically by sequence.
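
The three sort keys can be illustrated with a small sketch. The `Record` struct and `sort_records` function are hypothetical stand-ins, not MDAT code. The sketch assumes that "seq" compares the sequence strings themselves and that an unknown key is reported as an error; the documentation does not say how the real method handles that case.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record; `input_pos` is the position at which the
// sequence was originally read.
struct Record {
    std::size_t input_pos;
    std::string name;
    std::string seq;
};

// Sorts `records` by the key selected with `type`.
// Assumption: an unknown `type` throws std::invalid_argument.
void sort_records(std::vector<Record>& records, const std::string& type) {
    if (type == "input") {
        std::sort(records.begin(), records.end(),
                  [](const Record& a, const Record& b) { return a.input_pos < b.input_pos; });
    } else if (type == "name") {
        std::sort(records.begin(), records.end(),
                  [](const Record& a, const Record& b) { return a.name < b.name; });
    } else if (type == "seq") {
        std::sort(records.begin(), records.end(),
                  [](const Record& a, const Record& b) { return a.seq < b.seq; });
    } else {
        throw std::invalid_argument("unknown sort type: " + type);
    }
}
```

Storing the input position with each record is what makes it possible to restore the original order after sorting by another key.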
template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::transfer ( SequenceSetBase< SequenceType, MemoryType > &  set,
size_t  id 
)
inline

Transfer a sequence from one set to another.

Parameters
set  The set to take the sequence from.
id  The index of the sequence.
template<typename SequenceType, typename MemoryType>
void MDAT::SequenceSetBase< SequenceType, MemoryType >::transfer ( SequenceSetBase< SequenceType, MemoryType > &  set)
inline

Transfers all sequences from one set to another.

Parameters
set  The set to take the sequences from.
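
The documentation does not spell out how share and transfer differ. One plausible reading is that share leaves the sequence in both sets while transfer removes it from the source set. The sketch below models that reading with `std::shared_ptr`; the `Set` alias and both free functions are hypothetical, and the real classes may manage memory differently depending on MemoryType.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: a set is a vector of shared sequence pointers.
using SeqPtr = std::shared_ptr<std::string>;
using Set = std::vector<SeqPtr>;

// Assumed semantics of share: afterwards both sets refer to the same object.
void share(Set& dst, const Set& src, std::size_t id) {
    assert(id < src.size());
    dst.push_back(src[id]);
}

// Assumed semantics of transfer: the sequence moves to `dst` and is
// removed from `src`.
void transfer(Set& dst, Set& src, std::size_t id) {
    assert(id < src.size());
    dst.push_back(std::move(src[id]));
    src.erase(src.begin() + static_cast<Set::difference_type>(id));
}
```

Under this model a shared sequence is modified for both sets at once, which is worth keeping in mind before calling methods such as to_upper on either set.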
template<typename SequenceType , typename MemoryType >
void MDAT::SequenceSetBase< SequenceType, MemoryType >::write ( const std::string &  seq_f,
const std::string  format 
) const
virtual

Writes the sequences into a file.

This function supports the following formats: FASTA, MSF.

Parameters
seq_f  The file to write the alignment to.
format  The format to use (fasta, clustalw, msf, phylip_i, phylip_s).

Friends And Related Function Documentation

template<typename SequenceType, typename MemoryType>
template<typename SeqType , typename MemType >
std::ostream& operator<< ( std::ostream &  out,
const SequenceSetBase< SeqType, MemType > &  seqSet 
)
friend

Prints a SequenceSet in FASTA format.

Parameters
out  [in,out] The output stream.
seqSet  [in] The sequence set.
Returns
The output stream
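
A stream operator of this kind can be sketched as follows. The `Record` and `RecordSet` types are hypothetical stand-ins for the MDAT classes, and the sketch assumes the simplest FASTA layout: a `>` header line with the name followed by the whole sequence on one line, without line wrapping.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins for a sequence and a sequence set.
struct Record {
    std::string name;
    std::string seq;
};

struct RecordSet {
    std::vector<Record> records;
};

// Writes each record as a '>' header line followed by the sequence line.
// Assumption: sequences are not wrapped to a fixed line width.
std::ostream& operator<<(std::ostream& out, const RecordSet& set) {
    for (const Record& rec : set.records) {
        out << '>' << rec.name << '\n' << rec.seq << '\n';
    }
    return out;
}

// Convenience wrapper that renders the set into a string.
std::string to_fasta(const RecordSet& set) {
    std::ostringstream os;
    os << set;
    return os.str();
}
```

Returning the stream by reference is what allows the operator to be chained, for example `std::cout << set << std::endl;`.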