Using Collaborative Filtering to Weave an Information Tapestry: Foreign-Language Translation Material

 2023-01-28 03:01



Graduation Thesis (Design) Literature Translation

Using Collaborative Filtering To Weave An Information Tapestry

Tapestry is an experimental mail system developed at the Xerox Palo Alto Research Center. The motivation for Tapestry comes from the increasing use of electronic mail, which is resulting in users being inundated by a huge stream of incoming documents [2, 7, 12]. One way to handle large volumes of mail is to provide mailing lists, enabling users to subscribe only to those lists of interest to them. However, as illustrated in Figure 1, the set of documents of interest to a particular user rarely maps neatly to existing lists. A better solution is for a user to specify a filter that scans all lists, selecting interesting documents no matter what list they are in. Several mail systems support filtering based on a document's contents [3, 5, 6, 8]. A basic tenet of the Tapestry work is that more effective filtering can be done by involving humans in the filtering process.

In addition to content-based filtering, the Tapestry system was designed and built to support collaborative filtering. Collaborative filtering simply means that people collaborate to help one another perform filtering by recording their reactions to documents they read. Such reactions may be that a document was particularly interesting (or particularly uninteresting). These reactions, more generally called annotations, can be accessed by others' filters. One application of annotations is in support of moderated newsgroups.

Figure 1. (a) electronic mail overload; (b) using distribution lists; (c) conventional filtering; (d) collaborative filtering

Currently, moderated groups have a single moderator, who selects a subset of messages to be posted to the moderated group. With annotations, a group can have many moderators. To see the newsgroup as it would be moderated by (say) Smith, simply filter for those articles that Smith endorsed with an annotation.

Implicit feedback from users (e.g., some user sent a reply to a document) can also be utilized in the filtering process. For example, suppose you would like to receive 'interesting' documents from the NetNews newsgroup comp.unix-wizards in the mail, but you don't know how to write a search expression that characterizes them, and you don't have time to read them all yourself. However, you know that Smith, Jones, and O'Brien read all of the comp.unix-wizards newsgroup material, and reply to the more interesting documents. Tapestry allows you to filter on documents replied to by Smith, Jones, or O'Brien.
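The reply-based filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Message` record, its field names, and the `replied_to_by` helper are hypothetical and are not Tapestry's actual data model or query language.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Message:
    # Hypothetical record layout, not Tapestry's real schema.
    msg_id: str
    sender: str
    newsgroup: str
    in_reply_to: Optional[str] = None  # msg_id of the message being answered


def replied_to_by(messages, newsgroup, trusted):
    """Return messages in `newsgroup` that at least one trusted reader replied to."""
    # Collect the ids of every message answered by a trusted reader.
    answered = {
        m.in_reply_to
        for m in messages
        if m.in_reply_to is not None and m.sender in trusted
    }
    return [m for m in messages if m.newsgroup == newsgroup and m.msg_id in answered]


messages = [
    Message("m1", "alice", "comp.unix-wizards"),
    Message("m2", "bob", "comp.unix-wizards"),
    Message("m3", "Smith", "comp.unix-wizards", in_reply_to="m1"),
    Message("m4", "carol", "comp.unix-wizards", in_reply_to="m2"),
]
selected = replied_to_by(messages, "comp.unix-wizards", {"Smith", "Jones", "O'Brien"})
```

Here only `m1` is selected, because it is the only message that one of the three trusted readers answered.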

Collaborative filtering is novel because it involves the relationship between two or more documents, namely a message and its reply, or a document and its annotations. Unlike current filtering systems, Tapestry filters cannot be computed by simply examining a document when it arrives, but rather require (potentially) repeatedly issuing queries over the entire database of previously received documents. This is because sometime after a document arrives, a human (say Smith) may read that document and decide it is interesting. At the time he replies to it (or annotates it), you want your filter to trigger and send you the original document.
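A minimal sketch of why such a filter has to be re-run over the stored documents rather than evaluated once on arrival. The list-based stores, the `delivered` set, and the `run_filter` function are assumptions made for illustration; the excerpt does not describe how Tapestry actually schedules or optimizes these queries.

```python
documents = []     # append-only list of document ids (assumed representation)
annotations = []   # append-only list of (doc_id, annotator) pairs
delivered = set()  # ids already sent to the user, so nothing is delivered twice


def run_filter(annotator):
    """Re-scan every stored document and return those that newly match."""
    endorsed = {doc_id for doc_id, who in annotations if who == annotator}
    newly_matching = [d for d in documents if d in endorsed and d not in delivered]
    delivered.update(newly_matching)
    return newly_matching


documents.append("doc-1")
first_pass = run_filter("Smith")        # nothing matches when the document arrives

annotations.append(("doc-1", "Smith"))  # Smith reads and endorses it later
second_pass = run_filter("Smith")       # the older document now matches
third_pass = run_filter("Smith")        # already delivered, so nothing new
```

A filter evaluated only at arrival time would have stopped after `first_pass` and never delivered `doc-1`.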

Tapestry is more than a mail system, because it is designed to handle any incoming stream of electronic documents. Electronic mail is only one example of such a stream: others are newswire stories and NetNews articles [10]. Moreover, Tapestry is not only a mechanism for filtering mail; it is also a repository of mail sent in the past. Tapestry unifies ad hoc queries over this repository with the filtering of incoming data.

A typical scenario of Tapestry system usage is as follows. A user decides on 'mail filtering' as an area of interest. To find documents on this topic, the user issues an ad hoc query, perhaps by searching for the keyword 'filtering.' This returns too many documents. The user eventually discovers that searching either for documents containing both 'information' and 'filtering,' or for documents containing 'filtering' that received at least three endorsements, works much better. Once the user has tested this search, it is installed as a query filter, and from then on all new documents satisfying the filter are delivered to the user's mailbox.
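The scenario above uses the same predicate twice: first as an ad hoc query over the repository, then as an installed filter. The sketch below illustrates that idea in Python; the `Document` class, the `endorsements` counter, and all function names are hypothetical, and plain word matching stands in for whatever indexing Tapestry really uses.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical fields chosen for the example.
    doc_id: str
    text: str
    endorsements: int = 0


def has_word(doc, word):
    """True if `word` occurs as a whole word in the document text."""
    return word in re.findall(r"[a-z]+", doc.text.lower())


def query(doc):
    """Both keywords, or 'filtering' with at least three endorsements."""
    both_words = has_word(doc, "information") and has_word(doc, "filtering")
    well_endorsed = has_word(doc, "filtering") and doc.endorsements >= 3
    return both_words or well_endorsed


store = [
    Document("d1", "Information filtering for mail"),
    Document("d2", "Filtering coffee at home"),
    Document("d3", "Kalman filtering notes", endorsements=4),
    Document("d4", "Information retrieval survey", endorsements=5),
]

# Step 1: ad hoc query over the existing repository.
ad_hoc_hits = [d.doc_id for d in store if query(d)]

# Step 2: install the same predicate as a filter for new documents.
installed_filters = [query]
mailbox = []


def on_new_document(doc):
    store.append(doc)
    if any(f(doc) for f in installed_filters):
        mailbox.append(doc.doc_id)


on_new_document(Document("d5", "Collaborative filtering of information streams"))
on_new_document(Document("d6", "Water filtering tips"))
```

For brevity this sketch checks each document only when it arrives. Because endorsements can accumulate later, a faithful version would also need to re-evaluate the filter over stored documents.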

Architecture

Figure 2 shows the flow of documents through the major architectural components of Tapestry. These components are:

Indexer. Reads documents from external sources such as electronic mail, NetNews, or newswires and adds them to the document store. The indexer is responsible for parsing documents into a set of indexed fields that can be referenced in queries.

Document store. Provides long-term storage for all Tapestry documents. It also maintains indexes on the stored documents so that queries over the document database can be efficiently executed. The document store is append-only.

Annotation store. Provides storage of annotations associated with documents. The annotation store is also append-only.

Filterer. Repeatedly runs a batch of user-provided queries over the set of documents. Those documents matching a query are placed in the little box of the query's owner.

Little box. Queues up documents of interest to a particular user. Each user has a little box, where documents are deposited by the filterer and removed by the user's document reader.
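The components listed above might fit together roughly as in the following Python sketch. Everything here is an assumption made for illustration: the `Tapestry` class, its method names, the header-based `index` function, and the in-memory lists are not the system's real interfaces or storage.

```python
from collections import defaultdict, deque


def index(raw):
    """Indexer: parse 'Header: value' lines and a body into indexed fields."""
    head, _, body = raw.partition("\n\n")
    fields = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        fields[name.strip().lower()] = value.strip()
    fields["body"] = body.strip()
    return fields


class Tapestry:
    def __init__(self):
        self.document_store = []    # append-only: documents are never modified
        self.annotation_store = []  # append-only: (doc_index, annotator, note)
        self.queries = []           # (owner, predicate) pairs
        self.little_boxes = defaultdict(deque)
        self._queued = set()        # (query, document) pairs already delivered

    def receive(self, raw):
        self.document_store.append(index(raw))

    def annotate(self, doc_index, annotator, note):
        self.annotation_store.append((doc_index, annotator, note))

    def install(self, owner, predicate):
        self.queries.append((owner, predicate))

    def run_filterer(self):
        """Filterer: run every installed query over every stored document."""
        for q, (owner, predicate) in enumerate(self.queries):
            for d, doc in enumerate(self.document_store):
                notes = [(who, note) for i, who, note in self.annotation_store if i == d]
                if (q, d) not in self._queued and predicate(doc, notes):
                    self._queued.add((q, d))
                    self.little_boxes[owner].append(doc)

    def read(self, owner):
        """Document reader: remove the next document from the owner's little box."""
        box = self.little_boxes[owner]
        return box.popleft() if box else None


t = Tapestry()
t.install("you", lambda doc, notes: "filtering" in doc["subject"].lower())
t.receive("From: smith\nSubject: Mail filtering ideas\n\nSome text.")
t.receive("From: jones\nSubject: Lunch\n\nNoon?")
t.run_filterer()
first = t.read("you")
second = t.read("you")
```

The nested loop in `run_filterer` is deliberately naive. The document store's indexes exist so that real queries need not scan every document.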
