Index of /download/sequence/A_nidulans_FGSC_A4/current
Name Last modified Size Description
Parent Directory -
A_nidulans_FGSC_A4_unmapped_Broad_CDS.fasta.gz 24-Feb-2011 18:11 26K GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_chromosomes.fasta.gz 29-Jan-2012 15:10 8.7M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_not_feature.fasta.gz 29-Jan-2012 22:54 3.8M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_orf_coding.fasta.gz 29-Jan-2012 19:01 5.5M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_orf_genomic.fasta.gz 29-Jan-2012 16:49 6.1M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_orf_genomic_1000.fasta.gz 29-Jan-2012 16:01 13M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_orf_trans_all.fasta.gz 29-Jan-2012 21:13 3.8M GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_other_features_genomic.fasta.gz 29-Jan-2012 21:15 8.2K GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_other_features_genomic_1000.fasta.gz 29-Jan-2012 21:14 130K GZIP compressed document
A_nidulans_FGSC_A4_version_s07-m01-r04_other_features_no_introns.fasta.gz 29-Jan-2012 21:16 7.3K GZIP compressed document
EMBL_format/ 31-Jan-2012 19:11 -
This directory contains the most current version of Aspergillus nidulans FGSC A4
genomic sequences.
The notation "sXX-mYY-rZZ" in a file name indicates the genome version to which the
data in the file correspond. A detailed explanation of the genome versions can
be found at: http://www.aspgd.org/help/SequenceHelp.shtml#Anids_versions
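As an illustration of the "sXX-mYY-rZZ" notation, the following minimal Python sketch
pulls the three numeric components out of a file name. The function name parse_version
and the regular expression are assumptions made for this example; they are not part
of any AspGD tool, and the meaning of each component is explained at the URL above.

```python
import re

# Matches the "sXX-mYY-rZZ" token that follows "version_" in a file name.
# The pattern accepts any number of digits in each component (an assumption;
# the file names in this directory use two digits).
VERSION_RE = re.compile(r"version_s(\d+)-m(\d+)-r(\d+)")


def parse_version(filename):
    """Return the (s, m, r) components as integers, or None if no token is found."""
    match = VERSION_RE.search(filename)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


print(parse_version("A_nidulans_FGSC_A4_version_s07-m01-r04_chromosomes.fasta.gz"))
```

File names without a version token, such as the static unmapped Broad CDS file,
yield None.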
PLEASE NOTE: The sequence file names, as well as the chromosome identifiers within
the files, were updated in July and August 2010 to include the name of the species
and strain. This change was necessary to accommodate multiple Aspergillus and
Aspergillus-related species and strains at AspGD.
These files are updated weekly:
* Chromosomal sequence:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_chromosomes.fasta.gz
* Sequence with no introns for all ORFs:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_orf_coding.fasta.gz
* Sequence with introns for all ORFs:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_orf_genomic.fasta.gz
* Sequence with introns and untranslated region 1000 bp upstream and
downstream for all ORFs:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_orf_genomic_1000.fasta.gz
* Translation of all ORFs:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_orf_trans_all.fasta.gz
* Sequence of tRNAs (predicted using tRNAscan-SE), includes sequence of any introns:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_other_features_genomic.fasta.gz
Note: sequence of other non-ORF feature types will be added to this file when these
features are added to AspGD in the future.
* Sequence of tRNAs (predicted using tRNAscan-SE), with introns removed:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_other_features_no_introns.fasta.gz
Note: sequence of other non-ORF feature types will be added to this file when these
features are added to AspGD in the future.
* Genomic sequence of tRNAs (predicted using tRNAscan-SE), plus region
1000 bp upstream and downstream:
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_other_features_genomic_1000.fasta.gz
Note: sequence of other non-ORF feature types will be added to this file when these
features are added to AspGD in the future.
* Sequence between annotated chromosomal features (see note below):
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_not_feature.fasta.gz
The file A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_not_feature.fasta.gz
contains DNA sequences which are between chromosomal features.
Features excluded from this file are ARS, ORF, centromere, long_terminal_repeat,
ncRNA, pseudogene, rRNA, retrotransposon, snRNA, snoRNA, tRNA, telomere,
telomeric_repeat, transposable_element_gene, blocked_reading_frame, repeat_region.
The file is in compressed FASTA format. This file is updated whenever features
are added or removed or there are changes to feature boundaries.
The archive/ directory contains all previous versions of the
A_nidulans_FGSC_A4_version_sXX-mYY-rZZ_not_feature.fasta.gz file.
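The weekly-updated file names listed above share one prefix and one version token,
so the full set for a given version can be generated rather than typed by hand. The
sketch below is only an illustration: the names PREFIX, SUFFIXES and
expected_filenames are hypothetical, and the suffix list is copied from the file
names in this README.

```python
# Common prefix of every versioned file in this directory.
PREFIX = "A_nidulans_FGSC_A4_version_"

# File-type suffixes, with the descriptions given in this README.
SUFFIXES = {
    "chromosomes": "Chromosomal sequence",
    "orf_coding": "Sequence with no introns for all ORFs",
    "orf_genomic": "Sequence with introns for all ORFs",
    "orf_genomic_1000": "ORF sequence plus 1000 bp upstream and downstream",
    "orf_trans_all": "Translation of all ORFs",
    "other_features_genomic": "tRNA sequence, including any introns",
    "other_features_no_introns": "tRNA sequence, introns removed",
    "other_features_genomic_1000": "tRNA sequence plus 1000 bp upstream and downstream",
    "not_feature": "Sequence between annotated chromosomal features",
}


def expected_filenames(version):
    """Build the versioned file names for a token such as 's07-m01-r04'."""
    return [f"{PREFIX}{version}_{suffix}.fasta.gz" for suffix in SUFFIXES]


for name in expected_filenames("s07-m01-r04"):
    print(name)
```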
This file is static (not updated):
* Coding sequence (no introns) from the Broad annotation updates (2009)
that could not be mapped completely to the Version 4 chromosomal sequence:
A_nidulans_FGSC_A4_unmapped_Broad_CDS.fasta.gz
PLEASE NOTE: The file A_nidulans_FGSC_A4_unmapped_Broad_CDS.fasta.gz
contains new genes in the Broad Institute (2009) annotation set that were
un-mappable to the Version 4 chromosomal sequence, and consequently
were not assigned chromosomal coordinates or sequence in the initial
release of the Version 5 annotation. The sequence in this file is
based on the Version 3 contigs. This file does not contain the
coding sequence for gene models from the Broad annotation that were
intended to be updated versions of existing genes. These modifications
were rejected from inclusion in the Version 5 reference annotation
because the gene could not be mapped completely to the Version 4
chromosomal sequence. (We can provide these sequences upon request.)
The processing of the Broad updates and generation of Version 5
of the annotation is described in more detail in the AspGD
Sequence Documentation:
http://www.aspgd.org/help/SequenceHelp.shtml#Anids_assemblies.
#################################################################################
The files in this directory are in FASTA format.
The FASTA header lines include the AN and ANID identifiers for each
ORF, the contig number (Version 3), the contig coordinates, the strand,
a brief description from the Broad Institute, and the length of the
sequence in nucleotides.
All files are gzip compressed. There are several freely available
software options for decompressing gzipped files on Windows. The
software and other useful information are available on these web sites:
- WinZip (http://www.winzip.com/)
- Stuffit (http://www.stuffit.com/)
- Gzip (http://www.gzip.org/)
and the gzip user's manual:
http://www.math.utah.edu/docs/info/gzip_toc.html
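Where Python is available, the gzipped FASTA files can also be read directly,
without decompressing them to disk first. The sketch below uses only the standard
gzip module; the function name read_fasta is an assumption made for this example,
and the header is returned as a single string because the exact field layout is
not specified here.

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file."""
    header, chunks = None, []
    # "rt" opens the compressed file in text mode, so lines arrive as str.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.rstrip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif line:
                chunks.append(line)
    # Emit the final record once the file is exhausted.
    if header is not None:
        yield header, "".join(chunks)
```

For example, iterating over
read_fasta("A_nidulans_FGSC_A4_version_s07-m01-r04_orf_coding.fasta.gz") would
return one header and one sequence string per ORF, assuming that file has been
downloaded to the working directory.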
Additional sequence documentation is found on the AspGD web site at:
http://www.aspergillusgenome.org/help/SequenceHelp.shtml
------------------------------------------------