** University    College
Bachelor of Engineering Degree Thesis (Design)

Title: Web-Based Industry News Collection System
Student ID:
Name:
School (Department): School of Information Engineering
Major: Information Management and Systems
Completion date:
Supervisor:

Abstract

With the rapid development of the Internet and the arrival of the information age, the volume of news published online has become overwhelming, and collecting and filtering the useful part of it matters a great deal. An industry news collection system extracts unstructured news articles from the web pages of multiple news sources and stores them in a structured database. Large portal sites such as Sina and Tencent update their content every day across a very wide range of topics, covering events throughout the country and around the world, and they do so by using collection systems to gather content from major media websites and foreign websites. Information collection is therefore critical.

Most website news publishing platforms rely on manual entry. For small and medium-sized sites that workload is manageable, but as a site grows the volume of information becomes very large; on classified-information sites in particular, keeping content up to date becomes a complicated task. A dedicated industry news collection system, working much like a specialized search engine that retrieves the latest relevant information from related websites for publication on one's own site, can greatly reduce that workload and also makes the collected content easy to revise and filter. Well-known collection systems at present include Huochetou (Locomotive), the Shicai News Collector [2], the Universal News Collector, and the Sina News Collector.

Keywords: information collection; industry news collection; .NET; SQL Server

Contents

Abstract
Chapter 1 Introduction
    Background of the Project
    Significance of Developing the System
    Project Title
    Problem Description
Chapter 2 Feasibility Study
    Economic Feasibility
    Technical Feasibility
    Introduction to the Development Tools
    Introduction to the Factory-Pattern Three-Tier Architecture
Chapter 3 System Analysis
    Functional Requirements
    Performance Requirements
    Operating Requirements
    Data Flow Diagram
    Use Case Diagram
    Data Dictionary
    Conceptual Structure Design
    Logical Structure Design
    Main Database Table Structures
    Physical Structure Design
Chapter 4 Overall Design
    Overall Functional Module Design