This morning Ivan Ovcharenko, from Lawrence Livermore, is presenting on comparative sequence analysis. I will update over the course of the morning.
Comparative genomics. The human genome is done: 50% is junk and only 3% codes for proteins. The function of the other 47% is largely unknown. Comparative sequence analysis: biologically functional regions tend to stay conserved through evolution. By aligning homologous sequences, we can identify evolutionarily conserved regions (ECRs) of at least 80 residues with putative functional importance. In sequence alignment, you find matches, find mismatches, and insert gaps to linearize the alignment. The problem is how to deal with huge alignments. Early approaches included graphical conservation profiles: (1) percent identity plots; (2) smooth graphs.
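To make the idea of scanning an alignment for ECRs concrete, here is a minimal sketch of a sliding-window percent-identity scan. It is not the method from the talk: the function name `find_ecrs` and the 70% identity cutoff are my own assumptions, and only the 80-residue minimum length comes from the presentation.

```python
def find_ecrs(aligned_a, aligned_b, window=80, min_identity=0.7):
    """Return (start, end) column ranges (end exclusive) of conserved regions.

    aligned_a / aligned_b are two rows of a pairwise alignment, with '-' for gaps.
    min_identity is a hypothetical cutoff chosen for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where both rows hold the same residue, 0 for a mismatch or a gap.
    matches = [int(a == b and a != "-") for a, b in zip(aligned_a, aligned_b)]
    n = len(matches)
    if n < window:
        return []
    regions = []
    count = sum(matches[:window])  # matches inside the current window
    for start in range(n - window + 1):
        if start > 0:
            # Slide the window one column to the right.
            count += matches[start + window - 1] - matches[start - 1]
        if count / window >= min_identity:
            end = start + window
            if regions and start <= regions[-1][1]:
                # Overlaps or touches the previous region: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

Plotting `count / window` against `start` instead of thresholding it would give something like the percent identity plots mentioned above.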
Comparative genomics to genome biology. Experimental assessment of the biological function of ECRs. A researcher knocked out a small non-coding region in mouse, which reduced expression of three other genes.
Aligning vertebrate genomes. The human genome is big. Used heuristics to reduce the task: mask out repetitive elements (RepeatMasker), map syntenic regions in the two genomes (BLAT), align syntenic regions (BLASTZ), visualize alignments (ECR Browser). Comparing human and mouse: 10% of the genome is conserved (1 million ECRs). Comparing human and fugu: 0.2% is conserved (41,067 ECRs).
Gene deserts: there are long regions that contain no coding elements. What is their function? Gene deserts are relatively strongly conserved between human and mouse. Many are present in chicken as well, some in fugu. Low GC content and SNPs suggest few functional elements. Gene deserts include regulatory elements. "Core" ECRs are those preserved in human, mouse, chicken, and fugu. Use these as a filter to identify human-mouse ECRs.
Phylogenetic shadowing: compare humans not with one but with many primate species. Use a 2-state HMM to identify conserved elements. Use the human-mouse comparison to remove generic vertebrate sequences and use the primate comparison to identify just primate matches.
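A rough sketch of how a 2-state HMM can label alignment columns as conserved or not, using the Viterbi algorithm. This is a simplification of what was presented: I am assuming each column is reduced to 1 (identical in every species) or 0 (not), and the emission and transition probabilities are invented for illustration.

```python
import math

STATES = ("conserved", "neutral")

def viterbi(obs, p_match, p_stay=0.95):
    """Most likely state path for obs, a list of 0/1 column scores.

    p_match[state] is the (hypothetical) probability that a column in that
    state is identical across species; p_stay is the (hypothetical)
    probability of remaining in the same state from one column to the next.
    """
    if not obs:
        return []

    def emit(state, o):
        p = p_match[state]
        return math.log(p if o == 1 else 1.0 - p)

    log_stay, log_switch = math.log(p_stay), math.log(1.0 - p_stay)

    def trans(prev, cur):
        return log_stay if prev == cur else log_switch

    # score[s] = best log-probability of any path ending in state s.
    score = {s: math.log(0.5) + emit(s, obs[0]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at column i + 1
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda r: score[r] + trans(r, s))
            new_score[s] = score[best_prev] + trans(best_prev, s) + emit(s, o)
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

For example, `viterbi(columns, {"conserved": 0.95, "neutral": 0.5})` returns one label per column; runs of "conserved" would be the candidate elements.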
The afternoon presentation is by Stephen J. Everse, speaking on "Understanding Coagulation: A Systems Biology Approach". Coagulation cascade: two forms of the pathway and lots of factors. The pathways essentially result in amplification to produce a lot of thrombin. This talk focuses on the "extrinsic" pathway. Hemostasis is a balance between making a clot and dissolving a clot. Methods of analysis: purified system, whole blood, computational models.
Whole-blood experiment: phlebotomy, contact pathway suppression by CTI, initiation by TF, quenching at intervals with EDTA and peptidyl chloromethyl ketones, analysis of current conditions. Results: clotting begins in about 5 minutes, thrombin complex increases over 10 minutes. But platelets get activated very early on, after only 5% of the total thrombin has been produced.
Computational models: what happens with all that extra thrombin? Use a complex computer model to explore. The extrinsic pathway initiates the intrinsic pathway, which produces most of the thrombin. After propagation, the model moves to termination, to keep the process in check. Comparing the mathematical model with real data: the match is pretty good. Now what? Could we use the model system to derive an inhibitor with particular properties to, say, remediate clotting problems as people age?
Of the variety of inhibitors examined, most did little, and hirudin was too strong (it didn't act in a linear fashion) and could interact with other factors, including oral contraceptives. Models may not apply in composite to a population, but maybe a model could be constructed for an individual. Longitudinal studies may allow the development of models of clotting behavior in individuals.
Structural basis for thrombin: in the extrinsic pathway, prothrombinase consists of factor Xa and factor Va (heavy chain and light chain), plus Ca2+ on a membrane. Adding the cofactor increases the rate of production 300,000-fold. If you deplete the calcium, the reaction slows way down. Factor V has three A domains, a B domain, and two C domains. Factor Va is created when the B domain is cleaved out, with the A domains wrapped around calcium. Use X-ray crystallography to try to determine the structure.
I have finished our lac operon model. I'm quite pleased with it. And there might even be time for some more model construction tonight or tomorrow morning. But first we should get the presentation nailed down for tomorrow.
I'm so happy! One of the presenters this week showed how elements in a signal transduction pathway could act as logical gates (a NAND gate), so as I was thinking about gene expression, I thought about how you could use that system to create logical gates. So I created a simulation I'm calling Logical Promoter that lets you simulate using genes to create any kind of logical gate. It was great!
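For anyone curious how a promoter can behave like a NAND gate, here is a minimal sketch. It is not the Logical Promoter simulation itself: I am assuming a gene that is repressed only when both inputs are present, modeled with a Hill-type function, and the constants `HIGH`, `K`, and `N` are made-up values. Because NAND is universal, the other gates can be built by wiring NAND promoters together.

```python
HIGH = 10.0  # hypothetical "input present" concentration (arbitrary units)
K = 1.0      # hypothetical half-repression constant
N = 2        # hypothetical Hill coefficient

def nand_promoter(a, b):
    """Steady-state output (0..HIGH) of a gene repressed only when both inputs are present."""
    repression = (a / K) ** N * (b / K) ** N
    return HIGH / (1.0 + repression)

def not_gate(a):
    # NOT x = x NAND x
    return nand_promoter(a, a)

def and_gate(a, b):
    # AND = NOT (a NAND b)
    return not_gate(nand_promoter(a, b))

def or_gate(a, b):
    # OR = (NOT a) NAND (NOT b)
    return nand_promoter(not_gate(a), not_gate(b))

def is_on(level):
    """Read a concentration as a logical value."""
    return level > HIGH / 2

if __name__ == "__main__":
    for a in (0.0, HIGH):
        for b in (0.0, HIGH):
            print(is_on(a), is_on(b),
                  "NAND:", is_on(nand_promoter(a, b)),
                  "AND:", is_on(and_gate(a, b)),
                  "OR:", is_on(or_gate(a, b)))
```

Each gate's output is fed straight in as the next gate's input concentration, which is why the output is scaled to the same `HIGH` level as the inputs.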