Index of /download/homology/orthologs/A_nidulans_FGSC_A4_S_cerevisiae_by_inparanoid
Name Last modified Size Description
A_nidulans_FGSC_A4_S_cerevisiae_orthologs.txt 23-Jul-2010 10:40 62K
inparanoid_output.07-22-2010.txt 23-Jul-2010 10:40 1.3M
orf_trans_all_Aspergillus_nidulans.07-22-2010.fasta.gz 23-Jul-2010 10:40 3.7M GZIP compressed docume>
orf_trans_all_Saccharomyces.07-22-2010.fasta.gz 23-Jul-2010 10:40 2.4M GZIP compressed docume>
rejected_sequences.wormpep.07-22-2010.fasta.gz 23-Jul-2010 10:40 3.4K GZIP compressed docume>
wormpep.07-22-2010.fasta.gz 23-Jul-2010 10:40 5.3M GZIP compressed docume>
This directory contains the input sequences that were used to
determine orthology assignments between A. nidulans and
S. cerevisiae using InParanoid version 3.0 (http://inparanoid.sbc.su.se/),
together with the output file generated by InParanoid. A file containing
the processed output, which lists the orthology assignments, is also
provided. The ortholog mappings are updated quarterly so that the
predictions are based on the most up-to-date information.
To run InParanoid, the A. nidulans proteins from AspGD were compared to
the latest set of S. cerevisiae proteins from SGD; the set of
C. elegans proteins from the Sanger Institute was used as an outgroup.
Stringent cutoffs were set: BLOSUM80 (instead of the default BLOSUM62),
and an InParanoid score of 100%.
Note that the ortholog pairings were generated automatically, with no
curator intervention. Thus, there will occasionally be pairings that
would not be reproduced with a different scoring matrix. In the interests
of automating the process, we do not intend to hand-curate the ortholog
pairs at this time.
For A. nidulans proteins that did not have an ortholog meeting
these criteria, we used BLASTp with the same parameters as were used
by InParanoid (-F "m S" -M BLOSUM80) and an expectation value (E)
cutoff of 1e-5 to identify their best hit in the S. cerevisiae protein
complement. These best-hit data are available here:
http://www.aspergillusgenome.org/download/homology/best_hits/
in the same format as the files containing the ortholog data.
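The best-hit search described above can be sketched as follows. This is a
minimal illustration, not the pipeline that was actually run: it assumes the
legacy NCBI blastall program (suggested by the -F and -M options), tabular
output (-m 8), and hypothetical file and database names. The function names
are illustrative only.

```python
def build_blastp_command(query_fasta, database, output_path):
    """Return an argv list for a legacy NCBI blastall BLASTp search.

    The -F, -M and -e values come from the text above; the choice of
    blastall, the -m 8 tabular output and all paths are assumptions.
    """
    return [
        "blastall",
        "-p", "blastp",
        "-i", query_fasta,   # hypothetical query FASTA (A. nidulans proteins)
        "-d", database,      # hypothetical formatted S. cerevisiae database
        "-F", "m S",         # soft masking with SEG
        "-M", "BLOSUM80",    # scoring matrix
        "-e", "1e-5",        # expectation value cutoff
        "-m", "8",           # tabular output (assumed, for easy parsing)
        "-o", output_path,
    ]


def best_hits(tabular_lines):
    """Keep the highest bit-score hit per query from -m 8 tabular lines."""
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Columns: query, subject, ..., e-value (index 10), bit score (index 11)
        query, subject, bit_score = fields[0], fields[1], float(fields[11])
        if query not in best or bit_score > best[query][1]:
            best[query] = (subject, bit_score)
    return best
```

The command list can be passed to subprocess.run if blastall is installed;
best_hits then reduces the tabular output to one subject per query.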
The following files are available:
orf_trans_all_Aspergillus_nidulans.MM-DD-YYYY.fasta.gz - the A. nidulans protein complement
orf_trans_all_Saccharomyces.MM-DD-YYYY.fasta.gz - the S. cerevisiae protein complement
wormpep.MM-DD-YYYY.fasta.gz - the C. elegans protein set used as an outgroup
inparanoid_output.MM-DD-YYYY.txt - the raw output from InParanoid
rejected_sequences.wormpep.MM-DD-YYYY.fasta.gz - the sequences rejected due to the worm outgroup
A_nidulans_FGSC_A4_S_cerevisiae_orthologs.txt - the processed output, with the AN id, the SGDID,
and the gene/ORF name from SGD
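As an illustration of how the processed ortholog file might be read, here is a
minimal sketch. It assumes the file is tab-delimited with the three columns in
the order listed above (AN id, SGDID, SGD gene/ORF name); the delimiter, the
handling of comment lines, and the names OrthologPair and parse_ortholog_lines
are assumptions, not part of the file specification.

```python
import csv
from typing import NamedTuple


class OrthologPair(NamedTuple):
    an_id: str      # A. nidulans identifier
    sgdid: str      # S. cerevisiae SGDID
    sgd_name: str   # gene/ORF name from SGD


def parse_ortholog_lines(lines):
    """Parse ortholog rows from an iterable of text lines.

    Assumes tab-delimited columns; rows with fewer than three
    fields and rows starting with '#' are skipped.
    """
    pairs = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 3 or row[0].startswith("#"):
            continue
        pairs.append(OrthologPair(row[0].strip(), row[1].strip(), row[2].strip()))
    return pairs


# Usage (file name taken from the listing above):
# with open("A_nidulans_FGSC_A4_S_cerevisiae_orthologs.txt", newline="") as handle:
#     pairs = parse_ortholog_lines(handle)
```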
The dates (indicated by MM-DD-YYYY) in the above file names give the date when
the input files were downloaded and the latest set of ortholog predictions was generated.
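The MM-DD-YYYY stamp can be recovered from a file name with a short helper. This
is a sketch assuming the date always appears between two dots, as in the file
names listed above; the function name download_date is illustrative.

```python
import re
from datetime import date

# Matches ".MM-DD-YYYY." as embedded in the dated file names.
_DATE_PATTERN = re.compile(r"\.(\d{2})-(\d{2})-(\d{4})\.")


def download_date(file_name):
    """Return the embedded date, or None if the name carries no date stamp."""
    match = _DATE_PATTERN.search(file_name)
    if match is None:
        return None
    month, day, year = (int(part) for part in match.groups())
    return date(year, month, day)
```

For example, download_date("inparanoid_output.07-22-2010.txt") yields
22 July 2010, while the undated processed ortholog file yields None.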