that graph theory methods may be applied for data mining. The database can be used to study networks of interactions, to map pathways across taxonomic branches, and to generate information for kinetic simulations.

Industrial Companies in Path Informatics
• Protein Pathways, Los Angeles, USA
• Genmetrics, Inc., Silicon Valley, USA
• Biobase, Braunschweig, Germany
• InforMax, Bethesda, MD and AxCell Bioscience, Newtown, PA
• Myriad Proteomics, Salt Lake City, Utah
• CuraGen Corporation, New Haven, CT, USA

Objectives of the KEGG Project
• Pathway Database: Computerize current knowledge of molecular and cellular biology in terms of pathways of interacting molecules or genes.
• Genes Database: Maintain gene catalogs of all sequenced organisms and link each gene product to a pathway component.
• Ligand Database: Organize a database of all chemical compounds in living cells and link each compound to a pathway component.
• Pathway Tools: Develop new bioinformatics technologies for functional genomics, such as pathway comparison, pathway reconstruction, and pathway design.
• Professor M. Kanehisa is the leading scientist on the project.

Data Representation in KEGG
• Entity: a molecule or a gene
• Binary relation: a relation between two entities
• Network: a graph formed from a set of related entities
• Pathway: metabolic pathway or regulatory pathway

Drosophila melanogaster Genes According to the KEGG Metabolic and Regulatory Pathways
Pathway Search by [ EC | Cpd | Gene | Seq ] [ 1st Level | 2nd Level | 3rd Level | Text Search ]
1. Carbohydrate Metabolism
2. Energy Metabolism
   Oxidative phosphorylation [PATH:dme00190]
   ATP Synthesis [PATH:dme00193]
   Carbon fixation [PATH:dme00710]
   Reductive carboxylate cycle (CO2 fixation) [PATH:dme00720]
   Methane metabolism [PATH:dme00680]
   Nitrogen metabolism [PATH:dme00910]
   Sulfur metabolism [PATH:dme00920]
3. Lipid Metabolism
4. Nucleotide Metabolism
5. Amino Acid Metabolism
6. Metabolism of Other Amino Acids
7. Metabolism of Complex Carbohydrates
8.
Metabolism of Complex Lipids
9. Metabolism of Cofactors and Vitamins

Introduction to GenMAPP
• Gene MicroArray Pathway Profiler, by Bruce Conklin at the Gladstone Institute, UCSF.
• GenMAPP is a free computer application designed to visualize gene expression data on maps representing biological pathways and groupings of genes.
• The main features underlying GenMAPP are:
   - Draw pathways with easy-to-use graphics tools
   - Multiple species gene databases
   - Color genes on MAPP files based on user-imported gene expression data

Part II. Path Metrics Software Tools for Developing Pathway Database, Performing Pathway Comparison, and Making Pathway Prediction

Topics to Cover
• SLIPPIR standard for pathway database model
• Gene, pathway, and tissue expression tools
• Pathway search engine
• Ortholog pathway prediction
• Pathway prediction user interface

SLIPPIR Standard for Pathway Curation
SLIPPIR stands for Standard for LInear Protein-Protein Interaction Representation.
• For linear comparison (homology), 2D diagrams of pathways are converted to a 1D format.
• We call the 2D diagrams graph pathways, and the corresponding 1D pathways linear pathways.
• One graph pathway may be transformed into multiple linear pathways. The generation of graph pathways and the corresponding linear pathways from scientific literature is called pathway curation.
• Pathways are curated by trained scientists with expertise in the relevant pathways. In addition to generating the graph pathway and the linear pathways, curators also have to produce a pathway description file for each pathway they curate (pathway annotation), and a protein file that contains all the proteins in the pathway.

Mode Symbol Specifications
A mode is usually specified by two non-character ASCII symbols.
• Direct interaction with direction. Used when there is a known direct interaction between two nodes (reverse orientation: ).
• | Direct inhibition with direction. Used when there is a direct inhibition from one node to the next.
| for reverse orientation.
• Association, indirect action. Used when there is an uncertain interaction, an indirect interaction, or simply co-expression.
• = = Parallel members. The members can all serve the same function. Usually variants of the same gene, or members of the same family.
• Clear interaction, but no direction of information flow (note: no space within, and no letters either). This can happen when more than two proteins are involved in forming a large complex.
• ** Bifurcating members. Usually appears only at the beginning or end of a pathway; it can occur in the middle of a pathway only when the pathway bifurcates and immediately folds back, e.g. AB**C**EF.
• If a pathway starts to bifurcate in the middle or at the end, one can use **[path_name] to record this event, e.g.: AB(xx)CD**[New_path_1]E**[New_path_2].
• ( ) Symbol for non-protein nodes. If the small molecule is uncertain, it can be omitted. If the small molecule is known, its name should be inserted in between, e.g. (Ca) or (cAMP). All the small molecules should be included inside a set