[Main text]
... studying repetitive sequences is also of great significance.

Genome annotation

(1) Repeat sequence analysis

A eukaryotic genome (equivalent to one complete set of genes) consists of DNA sequences present in only a single copy (also called single DNA, unique sequence, single-copy sequence, nonrepetitive sequence, etc.) and DNA sequences that recur many times.

FASTA first runs a fast preliminary screen of the sequence database to find sequences that are highly similar to the query sequence.

Only one protein sequence and one DNA sequence can be aligned at a time.

Programs in HMMER

Currently, the HMMER package contains nine programs. Two of these are programs for database searching:
• hmmpfam: Search an HMM database for matches to a query sequence.
• hmmsearch: Search a sequence database for matches to a single profile HMM.

The other programs in the package are:
• hmmalign: Align sequences to an existing model.
• hmmbuild: Build a model from a multiple sequence alignment.
• hmmcalibrate: Take an HMM and empirically determine parameters that are used to make searches more sensitive, by calculating more accurate expectation value scores (E-values).
• hmmconvert: Convert a model file into different formats, including a compact HMMER 2 binary format, and best-effort emulation of GCG profiles.
• hmmemit: Emit sequences probabilistically from a profile HMM.
• hmmfetch: Get a single model from an HMM database.
• hmmindex: Index an HMM database.

(2) Local alignment

I. BLAST: a search tool based on a local alignment algorithm. It can be used for local alignment of both nucleic acid and protein sequences.

The HMMER application package can be downloaded free of charge from [...].

II. MUSCLE: MUSCLE is open-source software for multiple sequence alignment of protein and nucleic acid sequences. It is better than ClustalW in both running speed and accuracy, and it can be run over the web or downloaded and run locally.

The current version is ClustalW2. ClustalW2 can be used for multiple sequence alignment of nucleic acids or proteins, and it can also be used to build phylogenetic trees.

Corresponding identical or similar symbols (A, T (or U), C, G for nucleic acids; the one-letter codes of amino acid residues for proteins) are placed in the same column.
• Two or more sequences are arranged together and their similarities are marked.

Cross_match is a general-purpose utility for comparing any two DNA sequence sets using the Smith-Waterman algorithm. For example, it can be used to compare a set of reads to a set of vector sequences and produce vector-masked versions of the reads, a set of cDNA sequences to a set of cosmids, contig sequences found by two alternative assembly procedures (for example, phrap and xbap) to each other, or phrap contigs to the final