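                    // (bb);
                    // The receiver of the next call was lost in extraction:
                    // ...("<a href=" + url + ">" + bb + "</a>");
                }
            }
        }
        return true;
    }

    // Perform the actual search.
    public ArrayList<String> crawl(String startUrl, int maxUrls, String searchString,
                                   boolean limithost, boolean caseSensitive) {
        System.out.println("searchString=" + searchString); // the search string
        HashSet<String> crawledList = new HashSet<String>();
        LinkedHashSet<String> toCrawlList = new LinkedHashSet<String>();

        if (maxUrls < 1) {
            errorList.add("Invalid Max URLs value.");
            System.out.println("Invalid Max URLs value.");
        }
        if (searchString.length() < 1) {
            errorList.add("Missing Search String.");
            System.out.println("Missing search String");
        }
        if (errorList.size() > 0) {
            System.out.println("err!!!");
            return errorList;
        }

        // Remove "www" from the start URL.
        startUrl = removeWwwFromUrl(startUrl);
        toCrawlList.add(startUrl);

        while (toCrawlList.size() > 0) {
            // Stop once the limit is reached. The source compares against 1 here;
            // the intended "no limit" sentinel may have been -1.
            if (maxUrls != 1) {
                if (crawledList.size() == maxUrls) {
                    break;
                }
            }

            // Get URL at bottom of the list.
            String url = toCrawlList.iterator().next();
            // Remove URL from the to crawl list.
            toCrawlList.remove(url);
            // Convert string url to URL object.
            URL verifiedUrl = verifyUrl(url);

            // Skip URL if robots are not allowed to access it.
            // if (!isRobotAllowed(verifiedUrl)) {
            //     continue;
            // }

            // Add the processed URL to crawledList.
            crawledList.add(url);
            // System.out.println("Crawled: " + verifiedUrl); // report the URL just crawled

            String pageContents = downloadPage(verifiedUrl);
            if (pageContents != null && pageContents.length() > 0) {
                // Retrieve the valid links from the page.
                // ArrayList<String> links = retrieveLinks(verifiedUrl, pageContents, crawledList, limitHost);
                HtmlParser parser = new HtmlParser(pageContents);
                ArrayList<String> links = ...; // the HtmlParser call used here was lost in extraction
                // System.out.println("test message!");
                // for (int j = 0; j < links.size(); j++) {
                //     System.out.println(links.get(j)); // check that the links were extracted
                // }
                toCrawlList.addAll(links); // add the newly found links

                if (searchStringMatches(url, pageContents, searchString, caseSensitive)) {
                    // result.add(url);
                    System.out.println("Search string found at: " + url); // print the matching address
                }
            }
        }
        return result;
    }

    // Main function.
    public static void main(String[] args) {
        if (args.length != 3) {
            System.out.println("Usage: java SearchCrawler startUrl maxUrl searchString");
            return;
        }
        int max = Integer.parseInt(args[1]);
        myspider crawler = new myspider(args[0], max, args[2]);
        Thread search = new Thread(crawler);
        System.out.println("Start searching...");
        System.out.println("result:");
        search.start();
    }
}

V. System Testing

The search test uses the default start page as the starting page and the search string "百度" (Baidu) as input, as shown in the figure below:

Clicking Search starts the run. When the run finishes, the result appears:

The search succeeded.

VI. Conclusion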
Developing this system drew on much of what I have studied, specifically data structures, Java programming, software engineering, and optimization theory. While programming I found that these subjects are interconnected and complement one another. More complex systems will certainly involve more, and more difficult, subjects, so there is a great deal still to learn and master.

This software is only a basic implementation of search engine functionality, and it has many technical shortcomings. Technology also changes quickly, so some of the ideas used here cannot fit every practical situation. Because I have not studied software engineering for long, the program design is not well standardized: some simple ideas were implemented with very long code, which made the code redundant, and some ideas were not implemented at all. I will keep improving in my future studies.

Acknowledgements

Now that this thesis is finally complete, I would like to express my sincere thanks to the teachers and senior students who gave me warm help and careful guidance.

First, I thank my supervisor for giving me the opportunity to study, for his careful academic guidance, and for his care and consideration in my daily life. He is not only my academic supervisor but also a scholar and elder I admire. What impressed me most were his breadth of knowledge, his rigorous approach to scholarship, and his tireless teaching; his modesty and constant concern for others; and his diligent, selfless attitude to work and his pursuit of excellence in research. I especially admire the bearing and breadth of mind with which he approaches important undertakings. I thank him here once again.

Finally, I thank my supervisor once more, and I also extend my deep gratitude to all the supervisors and teachers who, like him, teach diligently and give selflessly.

It was not until mid-afternoon that the inspector ambled up on his pony. My father pulled himself together, and went out to receive him; the effort to be even formally polite nearly strangled him. Even then the inspector was not brisk.
He dismounted in a leisurely fashion, and strolled into the house, chatting about the weather. Father, red in the face, handed him over to Mary who took him along to mother's room. Then followed the worst wait of all. Mary said afterwards that he hummed and ha'd for an unconscionable time while he examined the baby in minutest detail. At last, however, he emerged, with an expressionless face. In the little-used sitting-room he sat down at the table and fussed for a while about getting a good point on his quill. At last he took a form from his pouch, and in a slow, deliberate hand wrote that he officially found the child to be a true female human being, free from any detectable form of deviation. He regarded that thoughtfully for some moments, as though not perfectly satisfied. He let his hand hesitate before he actually dated and signed it, then he sanded it carefully, and handed it to my enraged father, still with a faint air of uncertainty. He had, of course, no real doubt in his mind, or he would have called for another opinion; my father was perfectly well aware of that, too. At last Petra's existence could be admitted. I was formally told that I had a new sister, and presently I was taken to see her where she lay in a crib beside my mother's bed. She looked so pink and wrinkled to me that I did not see how the inspector could have been quite sure about her. However, there was nothing obviously wrong with her, so she had got her certificate. Nobody could blame the inspector for that; she did appear to be as normal as a newborn baby ever looks. ... While we were taking turns to look at her somebody started to ring the stable bell in the customary way. Everyone on the farm stopped work, and very soon we were all assembled in the kitchen for prayers of thanksgiving. Two, or it may have been three, days after Petra was born I happened upon a piece of my family's history that I would prefer not to have known.
I was sitting quietly in the room next to my parents' bedroom where my mother still lay in bed. It was a matter of chance, and strategy, too. It was the latest place that I