Rules for PDB Comparison Tables


Last Update: 2001-10-26 Gail

These tables are used for storing PDB homologs and data from sequence comparison analysis.


TABLE PDB_Sequence
====================
pdb_sequence_no	Required.
		Primary key.  Oracle sequence number.
		Cannot be updated.
		NUMBER(10)

sequence_name	Required.
                VARCHAR2(50)

source		Required.
                Coded.
                On insert or update, check if value is in code table.
                VARCHAR2(40)

organism	Required.
                Coded.
                On insert or update, check if value is in code table.
                VARCHAR2(40)

sequence_length	Required. 
                NUMBER(10)

note		Optional.
                VARCHAR2(960)

date_created    Required.
                On insert, set to SYSDATE.
                On update, keep original date.
 
created_by      Required.
                On insert, set to USER

Other notes/rules:
===================
Organism is a coded value.  For the PDB data, Shuai's program will
first check to see if the PDB Source field matches an organism in the
code table.  If it does, it will insert it.  If it doesn't match,
organism will get "Other" or "Unknown", and the Source field will get
inserted into the description column, along with the Classification
field from PDB. I will make a list of synonyms for organism that Shuai
can check before loading.

Each row in the bio_seq table, including yeast sequences, should have
a corresponding row in the external_id table.

The type column will be values like "promoter", "consensus", "ORF".
The molecule column will be values like "DNA" or "protein".

There should NOT be cascade delete for this table ie. you cannot
delete a row in the bio_seq table if there are rows in:
bioseq_dataset, bioseq_family, or the alignment table.

Nightly checking script: check for rows in the bioseq table that are
not in bioseq_dataset, but are in the alignment table



TABLE PDB_Alignment
=====================
pdb_alignment_no	Required.
			Primary key.  Oracle sequence number.
			Cannot be updated.
			NUMBER(10)

query_seq_no		Required.
                        Oracle sequence number.
			Foreign key to pdb_sequence table

target_seq_no		Required.
                        Oracle sequence number.
			Foreign key to pdb_sequence table

method			Required.
                	Coded.
                	On insert or update, check if value is in code table.
                	VARCHAR2(40)

matrix			Required.
                	Coded.
                	On insert or update, check if value is in code table.
                	VARCHAR2(40)

query_align_start_coord	Required.
			NUMBER(10)

query_align_stop_coord	Required.
			NUMBER(10)

target_align_start_coord Required.
			NUMBER(10)

target_align_stop_coord	Required.
			NUMBER(10)

pct_aligned		Required.
			NUMBER(5,2)

pct_identical		Required.
			NUMBER(5,2)

pct_similar		Required.
			NUMBER(5,2)

score			Required.
			NUMBER(8,3)

date_created    	Required.
                	On insert, set to SYSDATE.
                	On update, keep original date.
 
created_by      	Required.
                	On insert, set to USER

Other notes
-----------
1. query_seq_no and target_seq_no combined are unique.



TABLE Alignment_sequence
==========================
alignment_sequence_no	Required.
			Primary key.  Oracle sequence number.
			Cannot be updated.
			NUMBER(10)

tab_name                Required.  Uppercase.
                        On insert or update, check if value is in ALL_TABLES.
                        VARCHAR2(30)

primary_key             Required.
                        VARCHAR2(40)

primary_key_col         Required. Uppercase.
                        VARCHAR2(240)

date_created    	Required.
                	On insert, set to SYSDATE.
                	On update, keep original date.
 
created_by      	Required.
                	On insert, set to USER

query_seq		Required.
			VARCHAR2(4000)

target_seq		Required.
			VARCHAR2(4000)

alignment_symbol	Required.
			VARCHAR2(4000)

Other Rules
-----------
1. tab_name, primary_key, primary_key_col together are unique.