These tables are used for storing PDB homologs and data from sequence comparison analysis.
TABLE PDB_Sequence
====================
pdb_sequence_no Required.
Primary key. Oracle sequence number.
Cannot be updated.
NUMBER(10)
sequence_name Required.
VARCHAR2(50)
source Required.
Coded.
On insert or update, check if value is in code table.
VARCHAR2(40)
organism Required.
Coded.
On insert or update, check if value is in code table.
VARCHAR2(40)
sequence_length Required.
NUMBER(10)
note Optional.
VARCHAR2(960)
date_created Required.
On insert, set to SYSDATE.
On update, keep original date.
created_by Required.
On insert, set to USER
Other notes/rules:
===================
Organism is a coded value. For the PDB data, Shuai's program will
first check to see if the PDB Source field matches an organism in the
code table. If it does, it will insert it. If it doesn't match,
organism will get "Other" or "Unknown", and the Source field will get
inserted into the description column, along with the Classification
field from PDB. I will make a list of synonyms for organism that Shuai
can check before loading.
Each row in the bio_seq table, including yeast sequences, should have
a corresponding row in the external_id table.
The type column will be values like "promoter", "consensus", "ORF".
The molecule column will be values like "DNA" or "protein".
There should NOT be cascade delete for this table ie. you cannot
delete a row in the bio_seq table if there are rows in:
bioseq_dataset, bioseq_family, or the alignment table.
Nightly checking script: check for rows in the bioseq table that are
not in bioseq_dataset, but are in the alignment table
TABLE PDB_Alignment
=====================
pdb_alignment_no Required.
Primary key. Oracle sequence number.
Cannot be updated.
NUMBER(10)
query_seq_no Required.
Oracle sequence number.
Foreign key to pdb_sequence table
target_seq_no Required.
Oracle sequence number.
Foreign key to pdb_sequence table
method Required.
Coded.
On insert or update, check if value is in code table.
VARCHAR2(40)
matrix Required.
Coded.
On insert or update, check if value is in code table.
VARCHAR2(40)
query_align_start_coord Required.
NUMBER(10)
query_align_stop_coord Required.
NUMBER(10)
target_align_start_coord Required.
NUMBER(10)
target_align_stop_coord Required.
NUMBER(10)
pct_aligned Required.
NUMBER(5,2)
pct_identical Required.
NUMBER(5,2)
pct_similar Required.
NUMBER(5,2)
score Required.
NUMBER(8,3)
date_created Required.
On insert, set to SYSDATE.
On update, keep original date.
created_by Required.
On insert, set to USER
Other notes
-----------
1. query_seq_no and target_seq_no combined are unique.
TABLE Alignment_sequence
==========================
alignment_sequence_no Required.
Primary key. Oracle sequence number.
Cannot be updated.
NUMBER(10)
tab_name Required. Uppercase.
On insert or update, check if value is in ALL_TABLES.
VARCHAR2(30)
primary_key Required.
VARCHAR2(40)
primary_key_col Required. Uppercase.
VARCHAR2(240)
date_created Required.
On insert, set to SYSDATE.
On update, keep original date.
created_by Required.
On insert, set to USER
query_seq Required.
VARCHAR2(4000)
target_seq Required.
VARCHAR2(4000)
alignment_symbol Required.
VARCHAR2(4000)
Other Rules
-----------
1. tab_name, primary_key, primary_key_col together are unique.