DOGMA
DOGMA is a program for fast and easy quality assessment of transcriptome and proteome data based on conserved protein domains. The source code can be obtained at https://zivgitlab.uni-muenster.de/domain-world/DOGMA and a webserver to run your analysis directly in the browser without installation is available here.
The figure schematically shows how DOGMA works, using proteome data as an example. DOGMA outputs a completeness score between 0 and 100% that measures the quality of your proteome or transcriptome data.
On this website you can find the additional data needed to evaluate your proteome/transcriptome. You will need the core sets (unless you want to create your own) and, optionally, our fast domain annotation tool RADIANT as an alternative to PfamScan.
DOGMA is written in Python and runs with version 2.7 or later (including Python 3). For help and instructions on how to use DOGMA, please check the UserManual.
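To give a feel for what a 0 to 100% completeness score expresses, here is a deliberately simplified sketch. It assumes the core set is a flat set of domain identifiers and scores a dataset by the percentage of those identifiers that were found. This is an illustration only and not DOGMA's actual scoring procedure, which is described in the UserManual and the publications cited below. The function name `completeness` and the example identifiers are chosen purely for illustration.

```python
def completeness(core_domains, found_domains):
    """Return the percentage of core domains present in found_domains.

    Toy model only: treats the core set as a flat set of identifiers.
    """
    core = set(core_domains)
    if not core:
        raise ValueError("core set must not be empty")
    hits = core & set(found_domains)
    return 100.0 * len(hits) / len(core)


# Hypothetical inputs: four expected core domains, three of them annotated.
core = {"PF00069", "PF00076", "PF00400", "PF00012"}
found = {"PF00069", "PF00400", "PF00012", "PF07714"}

print("{:.1f}%".format(completeness(core, found)))  # prints 75.0%
```

Domains found in the data but absent from the core set (here `PF07714`) do not affect the score in this toy model.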
Core sets
To run DOGMA you need a core set of conserved protein domains against which your proteome/transcriptome data can be compared. We provide several precomputed core sets for different clades. The archives for the newest DOGMA version can be downloaded here:
Core sets | Archive size (unpacked size) | Pfam version | Comment |
---|---|---|---|
pfam37.tbz | 99 MB (122 MB) | pfam v37 | Contains 11 core sets for pfam version 37 and has to be unpacked into the DOGMA folder. |
pfam36.tbz | 99 MB (121 MB) | pfam v36 | Contains 11 core sets for pfam version 36 and has to be unpacked into the DOGMA folder. |
pfam35.tbz | 96 MB (117 MB) | pfam v35 | Contains 11 core sets for pfam version 35 and has to be unpacked into the DOGMA folder. |
pfam34.tbz | 91 MB (111 MB) | pfam v34 | Contains 11 core sets for pfam version 34 and has to be unpacked into the DOGMA folder. |
pfam33.1.tbz | 90 MB (109 MB) | pfam v33.1 | Contains 11 core sets for pfam version 33.1 and has to be unpacked into the DOGMA folder. |
pfam32.tbz | 91 MB (108 MB) | pfam v32 | Contains 11 core sets for pfam version 32 and has to be unpacked into the DOGMA folder. |
pfam31.tbz | 88 MB (104 MB) | pfam v31 | Contains 11 core sets for pfam version 31 and has to be unpacked into the DOGMA folder. |
Currently core sets for the following clades are included: eukaryotes, archaea, bacteria, arthropods, insects, vertebrates, mammals, fungi, plants, monocots and eudicots. Alternatively, you can create your own core sets. For more information about this or for information about the included species in the precomputed core sets please check the UserManual.
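The table notes that each archive has to be unpacked into the DOGMA folder. A minimal sketch of that step using Python's standard `tarfile` module is shown below. The helper name `unpack_core_sets` and the example paths are assumptions for illustration, and the archive's internal layout is not inspected; the equivalent shell command `tar -xjf` achieves the same result.

```python
import tarfile
from pathlib import Path


def unpack_core_sets(archive, dogma_dir):
    """Extract a bzip2-compressed core set archive into dogma_dir.

    Returns the sorted names of the archive members.
    """
    dogma_dir = Path(dogma_dir)
    if not dogma_dir.is_dir():
        raise FileNotFoundError("DOGMA folder not found: {}".format(dogma_dir))
    # A .tbz file is a tar archive compressed with bzip2, hence "r:bz2".
    with tarfile.open(str(archive), mode="r:bz2") as tar:
        names = sorted(tar.getnames())
        tar.extractall(path=str(dogma_dir))
    return names
```

For example, `unpack_core_sets("pfam37.tbz", "DOGMA")` would extract the Pfam 37 core sets, assuming the archive was downloaded to the current directory and the DOGMA source lives in a folder named `DOGMA`.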
Contact the developer
If you find a problem or have questions or comments of any kind, please contact us (domainworld[@]uni-muenster.de).
Citation
If you use DOGMA in your project, please cite our publications:
Elias Dohmen, Lukas P.M. Kremer, Erich Bornberg-Bauer, and Carsten Kemena, DOGMA: domain-based transcriptome and proteome quality assessment, Bioinformatics (2016) 32 (17): 2577-2581. doi:10.1093/bioinformatics/btw231 http://bioinformatics.oxfordjournals.org/content/32/17/2577
Carsten Kemena, Elias Dohmen, Erich Bornberg-Bauer, DOGMA: a web server for proteome and transcriptome quality assessment, Nucleic Acids Research, 2019, doi:10.1093/nar/gkz366 https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkz366/5488015
Example publications utilizing DOGMA
- Touma J, García KK, et al., De novo Assembly and Characterization of Patagonian Toothfish Transcriptome and Develop of EST-SSR Markers for Population Genetics, Frontiers in Marine Science, 2019. https://doi.org/10.3389/fmars.2019.00720
- Cai H, Li Q, et al., A draft genome assembly of the solar-powered sea slug Elysia chlorotica, Scientific Data, 2019. https://doi.org/10.1038/sdata.2019.22
- Stevens KA, Wegrzyn JL, et al., Sequence of the Sugar Pine Megagenome, Genetics, 2016. https://doi.org/10.1534/genetics.116.193227