I'd like to fetch the DNA sequence of (some feature or coordinates). How do I go about this?

WormBase offers many ways to fetch sequences of features.

Using the Genome Browser

Enter the desired coordinates into the Genome Browser using the format "CHROMOSOME:START..STOP" (e.g. X:10..1000). If you don't know the coordinates of the feature of interest, just search for the feature itself. Right-click and drag with your mouse across the scale to select the region you are interested in. A window will pop up with a choice of options. Select "Dump selection as FASTA".
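If you prefer to check a region string before pasting it into the browser, or to cut the same region out of a sequence you have already downloaded, something like the following Python sketch may help. It is only an illustration: the helper names parse_region and extract are hypothetical (they are not part of any WormBase tool), and the code assumes that coordinates are 1-based and inclusive.

```python
import re

# Hypothetical helper, not a WormBase API: matches "CHROMOSOME:START..STOP".
REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>\d+)\.\.(?P<stop>\d+)$")


def parse_region(region):
    """Return (chromosome, start, stop) from a string such as 'X:10..1000'."""
    # Strip surrounding whitespace and any thousands separators (e.g. "1,000").
    match = REGION_RE.match(region.strip().replace(",", ""))
    if match is None:
        raise ValueError(f"not in CHROMOSOME:START..STOP format: {region!r}")
    start, stop = int(match["start"]), int(match["stop"])
    if start > stop:
        raise ValueError(f"start {start} is greater than stop {stop}")
    return match["chrom"], start, stop


def extract(sequence, start, stop):
    """Slice a sequence, assuming 1-based, inclusive coordinates."""
    return sequence[start - 1:stop]
```

For example, parse_region("X:10..1000") gives ("X", 10, 1000), and extract("ACGTACGT", 2, 4) gives "CGT".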

Using WormMart

Click here for some example WormMart queries.

How can I download all the [3' UTR] sequences from the C. elegans genome?

The best way to download all sequences (for example the 3' UTR sequences) is through WormBase ParaSite BioMart.

Here are the steps: go to http://parasite.wormbase.org/biomart/martview

Where can I get repeatmasked genomic sequences for C. elegans, C. briggsae, or C. remanei?

For C. elegans, using the most recent archival release of the database (as of 5 July 2012), you can get repeatmasked chromosomal sequences here:

You will need to uncompress the relevant files with "gunzip c_elegans.WS232.genomic_masked.fa.gz" or a similar command.
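If you would rather not uncompress the file on disk, a script can read the gzipped FASTA directly. The following is a minimal Python sketch, not an official WormBase tool: read_fasta is a hypothetical helper name, and it assumes an ordinary FASTA layout (a ">" header line followed by one or more sequence lines).

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, gzipped or plain."""
    # Choose the opener from the file extension; "rt" gives text mode in both.
    opener = gzip.open if str(path).endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts; emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)
```

Looping over read_fasta("c_elegans.WS232.genomic_masked.fa.gz") would then give one (header, sequence) pair per chromosome, assuming the file has been downloaded to the current directory.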

Note that in all of these cases, the sequences are hardmasked: i.e., the repeat sequences have been replaced by stretches of "N" residues, instead of being marked in some less information-destroying way. By contrast, softmasked sequences would keep the repeat sequences but distinguish them by changing their case: non-repeat sequences would be UPPERCASE, while repeat sequences embedded between the non-repeat sequences would be lowercase.
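To make the difference concrete, here is a small Python sketch of the two conventions. The function names soft_to_hard and masked_fraction are hypothetical, chosen only for this illustration. Note that the conversion only works in one direction: once repeats have been replaced by "N", the original bases cannot be recovered from the hardmasked file.

```python
import re


def soft_to_hard(sequence):
    """Convert a softmasked sequence to hardmasked: lowercase bases become 'N'."""
    return re.sub(r"[a-z]", "N", sequence)


def masked_fraction(sequence):
    """Fraction of residues that are 'N' (hardmasked) or lowercase (softmasked).

    Caveat: genuine assembly gaps are also written as 'N', so for a
    hardmasked sequence this counts gaps together with repeats.
    """
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base == "N" or base.islower())
    return masked / len(sequence)
```

For example, soft_to_hard("ACGTacgtACGT") gives "ACGTNNNNACGT", and masked_fraction("ACGTNNNN") gives 0.5.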

How do I get the sequence of an old Brugia malayi protein from the TIGR publication?

For instance, sna-1 is annotated as being orthologous to B. malayi 13258.m00169, and paralogous to B. malayi 14704.m00455; but where do I go to get these sequences?

For the time being, the best method for getting TIGR B. malayi sequences quickly (and without having to download the entire predicted B. malayi proteome by 'licensed FTP') is to do a BlastP search, against the "BMA1_pep" protein sequence set, on TIGR's B. malayi Blast server at:

http://tigrblast.tigr.org/er-blast/index.cgi?project=bma1

A successful BlastP search will give a report that has hypertext links to individual protein sequences such as 13258.m00169 and 14704.m00455.

Another option would be to do a bulk download of the relevant sequence data from TIGR itself. See TIGR's data release policy at http://www.tigr.org/tdb/e2k1/bma1/ for more details.

This is mostly to put the B. malayi data into historical context, as the current Brugia malayi assembly and gene set available on WormBase are more comprehensive, complete, and regularly updated.


Last edited by Chris Grove – 44 days ago