

Text Mining Lecture Slides (文本挖掘課件)


Outline of Today
- Introduction
- Lexicon construction
- Topic Detection and Tracking
- Summarization
- Question Answering

Data Mining: Market Basket Analysis
- 80% of the people who buy milk also buy bread.
- On Fridays, 70% of the men who bought diapers also bought beer.
- What is the relationship between diapers and beer?
- Walmart could trace the reason after doing a small survey!

The business opportunity in text mining?
[Chart: data volume and market cap, structured vs. unstructured data]

Corporate Knowledge "Ore"
- Email
- Insurance claims
- News articles
- Web pages
- Patent portfolios
- IRC
- Scientific articles
- Customer complaint letters
- Contracts
- Transcripts of phone calls with customers
- Technical documents
Stuff not very accessible via standard data mining.

Text Knowledge Extraction Tasks
- Small stuff: useful nuggets of information that a user wants:
  - Question Answering
  - Information Extraction (DB filling)
  - Thesaurus Generation
- Big stuff: overviews:
  - Summary Extraction (documents or collections)
  - Categorization (documents)
  - Clustering (collections)
- Text Data Mining: interesting unknown correlations that one can discover

Text Mining
- The foundation of most commercial "text mining" products is all the stuff we have already covered:
  - Information Retrieval engine
  - Web spider/search
  - Text classification
  - Text clustering
  - Named entity recognition
  - Information extraction (only sometimes)
- Is this text mining? What else is needed?

One tool: Question Answering
- Goal: use an encyclopedia or other source to answer "Trivial Pursuit"-style factoid questions
- Example: "What famed English site is found on Salisbury Plain?"

Another tool: Summarizing
- High-level summary or survey of all main points?
- How to summarize a collection?
- Example: sentence extraction from a single document

IBM Text Miner terminology: example of vocabulary found
- Certificate of deposit
- CMOs
- Commercial bank
- Commercial paper
- Commercial Union Assurance
- Commodity Futures Trading Commission
- Consul Restaurant
- Convertible bond
- Credit facility
- Credit line
- Debt security
- Debtor country
- Detroit Edison
- Digital Equipment
- Dollars of debt
- End-March
- Enserch
- Equity warrant
- Eurodollar

What is Text Data Mining?
- People's first thought: make it easier to find things on the Web.
- But this is information retrieval!
- The metaphor of extracting ore from rock:
  - Does make sense for extracting documents of interest from a huge pile.
  - But does not reflect notions of DM in practice. Rather:
    - finding patterns across large collections
    - discovering heretofore unknown information

Definitions of Text Mining
- Text mining is mainly about somehow extracting information and knowledge from text.
- Discovery of knowledge previously unknown to the user in text.
- Example of such a discovery (migraine and magnesium):
  - Calcium channel blockers prevent some migraines.
  - Magnesium is a natural calcium channel blocker.
  - High levels of magnesium inhibit SCD.
  - Migraine patients have high platelet aggregability.

Topic Detection and Tracking
- Example topics: the economic details of the shared currency; reports on the investigation following the crash.

Web-based Question Answering: results on TREC
- 900 questions
- Technique doesn't do too well (though it would have placed in the top 9 of ~30 participants!)
- MRR (mean reciprocal rank): the right answer is ranked about #4-#5 on average.
- Why? Because the technique relies on the enormity of the Web!
- Using the Web as a whole, not just TREC's 1M documents, the right answer is ranked about #2-#3 on average.

Issues
- In many scenarios (e.g., monitoring an individual's ...) we only have a small set of documents.
- Works best/only for "Trivial Pursuit"-style fact-based questions.
- Limited/brittle repertoire of:
  - question categories
  - answer data types/filters
  - query rewriting rules

ISI: Surface patterns approach
- Use of characteristic phrases
- "When was <person> born?"
- Typical answers:
  - "Mozart was born in 1756."
  - "Gandhi (1869-1948)..."
- Suggests phrases (regular expressions) like:
  - "<NAME> was born in <BIRTHDATE>"
  - "<NAME> (<BIRTHDATE>"
- Use of regular expressions can help locate the correct answer.

Use Pattern Learning
- Example:
  - "The great composer Mozart (1756-1791) achieved fame at a young age"
  - "Mozart (1756-1791) was a genius"
  - "The whole world would always be indebted to the great music of Mozart (1756-1791)"
- The longest matching substring for all 3 sentences is "Mozart (1756-1791)".
- A suffix tree would extract "Mozart (1756-1791)" as an output, with a score of 3.

Pattern Learning (cont.)
- Repeat with different examples of the same question type:
  - "Gandhi 1869", "Newton 1642", etc.
- Some patterns learned for BIRTHDATE:
  - a. born in <ANSWER>, <NAME>
  - b. <NAME> was born on <ANSWER>,
  - c. <NAME> (<ANSWER>
  - d. <NAME> (<ANSWER>)

Experiments
- 6 different question types, from the Webclopedia QA Typology (Hovy et al., 2002a):
  - BIRTHDATE, LOCATION, INVENTOR, DISCOVERER, DEFINITION, WHY-FAMOUS

Experiments: pattern precision
- BIRTHDATE patterns:
  - <NAME> (<ANSWER>)
  - <NAME> was born on <ANSWER>,
  - <NAME> was born in <ANSWER>
  - <NAME> was born <ANSWER>
  - <ANSWER> <NAME> was born
  - <NAME> (<ANSWER>
- INVENTOR patterns:
  - <ANSWER> invents <NAME>
  - the <NAME> was invented by <ANSWER>
  - <ANSWER> invented the <NAME> in

Experiments (cont.)
- DISCOVERER patterns:
  - when <ANSWER> discovered <NAME>
  - <ANSWER>'s ...
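To make the surface-pattern approach concrete, here is a minimal sketch in Python of applying BIRTHDATE patterns to retrieved text snippets. The pattern set, the precision weights, and the function name are illustrative assumptions, not the patterns or scores actually learned in the experiments above:

```python
import re

# BIRTHDATE surface patterns rewritten as regular expressions. "{name}" is
# filled in from the question; the group captures the candidate <ANSWER>.
# The precision weights are made-up placeholders.
BIRTHDATE_PATTERNS = [
    (r"{name} \((\d{{4}})", 0.85),            # "Mozart (1756..."
    (r"{name} was born in (\d{{4}})", 0.80),  # "Mozart was born in 1756"
    (r"born in (\d{{4}}), {name}", 0.65),     # "born in 1756, Mozart"
]

def find_birthdate(name: str, snippets: list[str]) -> str | None:
    """Return the highest-scoring birth-year candidate across all snippets."""
    scores: dict[str, float] = {}
    for pattern, precision in BIRTHDATE_PATTERNS:
        regex = re.compile(pattern.format(name=re.escape(name)))
        for snippet in snippets:
            for match in regex.finditer(snippet):
                answer = match.group(1)
                scores[answer] = scores.get(answer, 0.0) + precision
    return max(scores, key=scores.get) if scores else None

snippets = [
    "The great composer Mozart (1756-1791) achieved fame at a young age.",
    "Mozart was born in 1756 in Salzburg.",
]
print(find_birthdate("Mozart", snippets))  # -> "1756"
```

Scoring candidates by summed pattern precision follows the same intuition as the suffix-tree counts above: an answer matched by more, higher-precision patterns wins.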
Summarization: sentence extraction
- Combine the features using Naïve Bayes.
- Can rank sentences according to score and show the top n to the user.

Evaluation
- Compare extracted sentences with sentences in abstracts.

Evaluation of features
- Baseline (choose first n sentences): 24%
- Overall performance (42-44%) not very good.
- However, there is more ...
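A toy version of the extractor and its evaluation, as a sketch: the position and term-frequency features below are illustrative stand-ins (the actual features behind the 42-44% figure are not listed in the surviving text), and the evaluation function mirrors the "compare extracted sentences with abstract sentences" idea:

```python
import re
from collections import Counter

def extract_sentences(text: str, n: int = 3) -> list[str]:
    """Score each sentence with simple features; return the top n in document order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))  # document-wide term frequency

    def score(i: int, sentence: str) -> float:
        position = 1.0 / (i + 1)  # illustrative position feature: earlier is better
        tokens = re.findall(r"[a-z]+", sentence.lower())
        tf = sum(freq[t] for t in tokens) / (len(tokens) or 1)  # avg term frequency
        return position + 0.1 * tf  # weights are arbitrary placeholders

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(i, sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n])]  # restore document order

def overlap_with_abstract(extracted: list[str], abstract: list[str]) -> float:
    """Evaluation as above: fraction of abstract sentences the extract recovered."""
    return len(set(extracted) & set(abstract)) / len(abstract)
```

With n matched to the abstract length, overlap_with_abstract yields percentages of the kind quoted above; the 24% baseline corresponds to simply taking the first n sentences.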