

Foreign Literature Translation (Chinese-English Parallel Text), Computer Science and Technology: Preprocessing and Mining Web Log Data for Web Site Personalization


[Digest] Abstract: We describe the web usage mining activities of an ongoing project, which we call ClickWorld, … by means of data and web mining techniques. The extracted knowledge is deployed for the personalization and proactive delivery of web services. … web pages; the second tries to predict whether a user may be interested in visiting a part of the site. Structure mining aims at discovering the underlying topology of the interconnections among web objects; it can be used to classify and rank web sites and to discover similarities among them. Access logs of the … site were collected and preprocessed, covering a period of five months. The site includes national …, timetables, etc. The first experiment aims at extracting a …; the second experiment aims at extracting a …. After validation, a new cookie is sent to the user's browser. The system loses a user as soon as she deletes the cookie. Moreover, if the user registers, the association between login and cookie is available in the input data, and the user can then be tracked even after she deletes her cookie. This mechanism also allows detecting non-human users, such as system diagnosis and monitoring programs. This is … possible if the user is some program that automatically deletes the assigned cookie … news, finance, photos, jokes, shopping, forum, pubs; the second level …


[Main text] … file name 1,3478,|DX,00 contains a code for the local web site (1 stands for …), a web page id (3478) and its specific parameters (DX). The form above has been designed for efficient machine processing. For instance, the web page id is a key into a database table where the page template is found, while the parameters allow for retrieving the web page content from some other table. Unfortunately, this is a nightmare when mining clickstreams of URLs. Syntactic features of URLs are of little help: we need some semantic information, or ontology [5,13], assigned to URLs.

At best, we can expect that an application-level log is available, i.e. a log of accesses to semantically relevant objects. An example of an application-level log is one recording that the user entered the site from the home page, then visited a sport page with news on a soccer team, and so on. This would require a system module monitoring user steps at a semantic level of granularity. In the ClickWorld project, such a module is called Click Observe. Unfortunately, however, the module is a deliverable of the project, and it was not available for collecting data at the beginning of the project.

Therefore, we decided to extract both syntactic and semantic information from URLs via a semi-automatic approach. The adopted approach consists in reverse-engineering URLs, starting from the web site designer's description of the meaning of each URL path, web page id and set of web page parameters. Using a PERL script (sketched below), and starting from the designer's description, we extracted the following information from the original URLs:

- the local web server (e.g., … or …, etc.), which provides us with some spatial information about user interests;
- a first-level classification of URLs into 24 types, some of which are: home, news, finance, photo galleries, jokes, shopping, forum, pubs;
- a second-level classification of URLs depending on the first-level one, e.g. a URL classified as shopping may be further classified as book shopping or pc shopping, and so on;
- a third-level classification of URLs depending on the second-level one, e.g. a URL classified as book shopping may be further classified as programming book shopping or narrative book shopping, and so on;
- parameter information, further detailing the three-level classification, e.g. a URL classified as programming book shopping may have the ISBN book code as a parameter;
- the depth of the classification, i.e. 1 if the URL has only a first-level classification, 2 if the URL has a first- and a second-level classification, and so on.

Of course, the adopted approach was mainly a heuristic one, with the hierarchical ontology designed a posteriori. Also, the designed ontology does not exploit any content-based classification, i.e. the description of an elementary object such as a sport news item with id 12345 is its code (e.g., first level is news, second level is sport, parameter information 12345), with no reference to the content of the news item (was it reporting about some specific player?).
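As a concrete illustration of the opaque URL format above, here is a minimal sketch in Python (the project itself used a PERL script) that splits a code such as 1,3478,|DX,00 into its fields. The field layout (site code, page id, parameters, trailing code) is assumed from the single example in the text; the meaning of the trailing 00 is not documented here.

```python
# Hedged sketch: split an opaque URL code like "1,3478,|DX,00" into the
# fields described in the text. The field layout is assumed from the example.

def parse_url_code(code: str) -> dict:
    """Split a comma-separated URL code into its components."""
    site, page_id, params, suffix = code.split(",")
    return {
        "site": int(site),        # code for the local web site (1 stands for ...)
        "page_id": int(page_id),  # key into the table holding the page template
        "params": params,         # page-specific parameters, e.g. "|DX"
        "suffix": suffix,         # trailing code; meaning not given in the text
    }

print(parse_url_code("1,3478,|DX,00"))
# {'site': 1, 'page_id': 3478, 'params': '|DX', 'suffix': '00'}
```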
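The three-level classification and the depth computation can likewise be sketched as a lookup against a designer-supplied table. Everything below is hypothetical: the page ids, class names and table entries merely echo the shopping/news examples in the list above, not the project's actual ontology.

```python
# Hedged sketch of the semi-automatic classification step: a hand-built table
# (standing in for the designer's URL description) maps a page id to its
# first/second/third-level classes and parameter role. All entries are
# hypothetical examples taken from the text.

from typing import NamedTuple, Optional

class UrlClass(NamedTuple):
    first: str                    # e.g. "news", "shopping"
    second: Optional[str] = None  # e.g. "book shopping"
    third: Optional[str] = None   # e.g. "programming book shopping"
    param: Optional[str] = None   # e.g. "ISBN" for programming book shopping

    @property
    def depth(self) -> int:
        # 1 if only a first-level class, 2 if first and second, and so on
        return 1 + (self.second is not None) + (self.third is not None)

ONTOLOGY = {  # page id -> classification (hypothetical entries)
    3478: UrlClass("shopping", "book shopping", "programming book shopping", "ISBN"),
    9120: UrlClass("news", "sport"),
}

def classify(page_id: int) -> Optional[UrlClass]:
    return ONTOLOGY.get(page_id)

print(classify(3478).depth)  # 3
print(classify(9120).depth)  # 2
```

A table-driven design of this kind fits the a-posteriori nature of the ontology: new page ids are classified simply by adding rows, without touching the parsing code.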
點(diǎn)擊復(fù)制文檔內(nèi)容
研究報(bào)告相關(guān)推薦
文庫(kù)吧 www.dybbs8.com
備案圖鄂ICP備17016276號(hào)-1