
Volume 6, Issue 1, January 2016 ISSN: 2277 128X

International Journal of Advanced Research in

Computer Science and Software Engineering

Research Paper

Available online at: www.ijarcsse.com

Google Chrome Extension for Topic Summarization

Shardul Katare1, Saurabh Bawdhankar2, Sourab Patil3, Aniket Deshpande4, Prof. Shrikat Nagure5

Savitribaiphule Pune University, B.E. Department of Computer Engineering, RMD Sinhgad School of Engineering,

Warje, Pune, Maharashtra, India

Abstract- Recent studies indicate that comments are an important part of any information site. They give an overall view of the topic from the people's perspective. Customer responses to a particular product or piece of information are very useful for surveys on that product or information. The information gathered can be used for marketing strategies, statistical analysis and research surveys. Not much software is available for summarizing topics online. The proposed Chrome extension gives an overall summary of the topic under discussion by scanning the comments and performing sentence extraction and sentiment analysis. The chosen platform is the Google Chrome browser, as it is one of the most popular web browsers. It is also easy to develop a Chrome extension, as most of the instructions, software and packages needed are readily available on the internet.

Keywords- Web Browser Extension, Data Mining, Software Libraries, Languages, Pattern, Sentiment Analysis.

I. INTRODUCTION

The Chrome extension will summarize information about mobiles, gadgets and other topics, taking the TechCrunch blog as the data set. Extensions are software programs that modify and improve the functionality of the Chrome browser. Extensions are known as add-ons in Internet Explorer and Mozilla Firefox. The proposed Chrome extension will be written using web technologies such as JavaScript, HTML and CSS. Extensions provide functionality without diving deeply into native code.

The extension will provide a summary of the topic, which is taken as input from the site. It will also summarize the comments, as comments are an integral part of the topic. The benefit of this Chrome extension is that it is useful for gathering readers' feedback as well as for sentiment analysis.

Given a topic on an information site that is under discussion and has received comments, our model consists of three modules: sentence detection, which splits the topic post into sentences; word weighting, which weighs the words appearing in the comments; and the sentence selector, which computes a representative score for each sentence based on the frequency of occurrence of the words it contains. The model will take websites that have Facebook comments associated with the topic. The benefit of using Facebook comments is that the comments under the topic will not be spammed by malicious users, and any abusive content can be reported. The Facebook APIs are readily available and can be used for accessing the comments.
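The three modules described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the authors' implementation: the function names (`split_sentences`, `word_weights`, `select_sentences`), the regular-expression tokenizer, the raw frequency count used as the word weight, and the parameter `k` are all hypothetical choices, and no stop-word filtering is applied.

```python
import re
from collections import Counter

def split_sentences(text):
    """Sentence detection: naive split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(text):
    """Lower-case the text and keep alphabetic tokens only (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def word_weights(comments):
    """Word weighting: count how often each word occurs across all comments."""
    counts = Counter()
    for comment in comments:
        counts.update(tokenize(comment))
    return counts

def select_sentences(post, comments, k=1):
    """Sentence selector: score each sentence by the summed comment frequency
    of its distinct words and return the k best sentences in original order."""
    weights = word_weights(comments)
    sentences = split_sentences(post)
    scored = [
        (sum(weights[t] for t in set(tokenize(s))), idx)
        for idx, s in enumerate(sentences)
    ]
    # Highest score first; ties are broken by position in the post.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:k]
    return [sentences[idx] for _, idx in sorted(top, key=lambda pair: pair[1])]
```

Calling `select_sentences(post, comments, k=2)` returns the two sentences of the post whose words are most discussed in the comments. A real system would likely normalize by sentence length and discard common words, which this sketch leaves out.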

II. LITERATURE SURVEY

Data from Comments-Oriented Blog Summarization (CIKM '07):

Based on previous studies, it has been observed that comments do affect one's understanding of a topic. The solution measures word representativeness using information hidden in comments, and then selects sentences based on the representativeness of the words they contain. The ReQuT model gives the flexibility to measure word representativeness from three aspects: Reader, Quotation and Topic.

© 2016, IJARCSSE All Rights Reserved

Katare et al., International Journal of Advanced Research in Computer Science and Software Engineering 6(1), January - 2016, pp. 422-425

Sentiment analysis is the use of natural language processing, text analysis and computational linguistics to identify and extract subjective information from source material. In general, sentiment analysis gives the attitude of a speaker or writer with respect to some topic, or the overall contextual polarity of the document.
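To make the idea concrete, the following Python sketch classifies each comment as positive, negative or neutral by counting words from a small lexicon. It is a deliberately simple, assumed approach: the paper does not say which sentiment technique the extension uses, and the word lists `POSITIVE` and `NEGATIVE` as well as the function names are hypothetical.

```python
import re

# Hypothetical toy lexicon, for illustration only.
POSITIVE = {"good", "great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "poor", "hate", "slow", "broken"}

def comment_polarity(comment):
    """Label one comment by comparing its positive and negative word counts."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

def overall_sentiment(comments):
    """Count how many comments fall into each polarity class."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for comment in comments:
        counts[comment_polarity(comment)] += 1
    return counts
```

A lexicon count of this kind ignores negation and sarcasm, so it should be read as a baseline rather than as a recommendation.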

Existing software analyzes only the data provided by the author and not the comments below it, even though the author's information can be exaggerated just to convince readers. Recent studies have found that users read comments to see other people's responses, but this is a lengthy and time-consuming task because many sites have more than a hundred comments.

III. PROPOSED SYSTEM

3.1 Disadvantages of Existing Software:

• Not much software summarizes a topic along with its comments.

• Blog summarization focuses only on the data provided by the author, not on the comments given by users.

• Such software is not readily available as a Chrome extension, even though Chrome is one of the most used browsers.

3.2 Proposed System Introduction:

3.2.1 Modules:

Time Limit Module - This module enables the user to get the summary in stipulat



Google Chrome Extension for Topic Summarization

Shardul Katare1, Saurabh Bawdhankar2, Sourab Patil3, Aniket Deshpande4, Prof. Shrikat Nagure5

Savitribaiphule Pune University, B.E. Department of Computer Engineering, RMD Sinhgad School of Engineering,Warje, Pune, Maharashtra, India

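Abstract

Recent studies show that comments are an important part of any information site. They offer an overall view of a topic from the perspective of different people. Customer feedback on a product or a piece of information is very useful for surveys on that product or information. The information collected can be used for marketing strategies, statistical analysis and research surveys. Little software is available for summarizing topics online. The proposed Chrome extension scans the comments on the topic under discussion and produces an overall summary. Chrome was chosen because it is one of the most popular web browsers and because documentation for it is widely available online, which makes extensions easy to develop.

Keywords: Web browser extension, data mining, software libraries, languages, pattern, sentiment analysis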

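I. INTRODUCTION

The Chrome extension summarizes information about mobiles, gadgets and other topics, using the TechCrunch technology blog as the data set. Browser extensions are software programs that enhance the functionality of the browser; Internet Explorer and Firefox call them add-ons. They are typically built with front-end web technologies such as JavaScript, HTML and CSS, so developers can build useful tools without a deep understanding of the browser's native code.

The extension takes the topic on the page as its input and summarizes it. It also summarizes the comments, which are an integral part of the topic. The benefit of this Chrome extension is that it helps analyze readers' feedback and sentiment. Given a topic that is under discussion on an information site and has received comments, our model uses three modules: sentence detection splits the topic post into sentences, word weighting weighs the words that appear in the comments, and the sentence selector scores each sentence by the frequency of the words it contains. The model targets websites that use the Facebook comments module. The benefit of Facebook comments is that they are not spammed by malicious users, and abusive content can be reported. The Facebook APIs are readily available and can be used to access the comments.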

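II. LITERATURE SURVEY

Data from Comments-Oriented Blog Summarization (CIKM '07):

Previous studies have observed that comments affect a reader's understanding of a topic. The solution measures word representativeness using information hidden in the comments and then selects sentences according to the representativeness of the words they contain. The ReQuT model allows word representativeness to be measured from three aspects: Reader, Quotation and Topic.

Sentiment analysis uses natural language processing, text analysis and computational linguistics to identify and extract subjective information from source material. In general, it gives the attitude of a speaker or writer toward some topic, or the overall attitude of the document.

Existing software analyzes only the data provided by the author and not the comments, even though the author may exaggerate in order to convince readers. Recent studies have found that users read comments to see how other people respond, but this is a long and time-consuming task because many sites have more than 100 comments.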

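III. PROPOSED SYSTEM

3.1 Disadvantages of Existing Software:

• Not much software summarizes a topic together with its comments.

• Blog summarization focuses only on the data provided by the author, not on the comments given by users.

• Few such tools are available as Chrome extensions, even though Chrome is one of the most used browsers.

3.2 Proposed System Introduction:

3.2.1 Modules:

Time Limit Module - This module enables the user to get the summary within a stipulated time.

Topic Module - This module scans all the information given by the author or publisher of the topic.

Sentence Detection Module - This module splits the topic into sentences.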

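3.2.2 System Architecture

3.2.3 Algorithm

1. Start

2. Open Google Chrome

3. Open the TechCrunch website

4. Search for a topic on the website

5. Open the topic

6. Launch the browser extension

7. If (no comments found) "No comments available"; if (comments found) "Open the browser extension"

8. Click the "Get Summary" button

9. The summary is displayed.

10. The "Download as PDF" button downloads the summary in PDF format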
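The branch in step 7 and the display in steps 8 and 9 can be sketched as a small Python function. This is a hypothetical outline, not the extension's actual code: the name `get_summary`, the constant `NO_COMMENTS_MESSAGE`, and the idea of passing the summarizer in as a function are assumptions made for illustration, and the PDF download in step 10 is left out.

```python
NO_COMMENTS_MESSAGE = "No comments available"

def get_summary(comments, summarize):
    """Steps 7 to 9: report when the page has no comments,
    otherwise hand the comments to the supplied summarizer."""
    if not comments:
        return NO_COMMENTS_MESSAGE
    return summarize(comments)
```

Passing the summarizer as an argument keeps the control flow separate from whichever summarization method is plugged in.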

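3.2.4 Feasibility Study:

P-type problems -

If the running time is some polynomial function of the size of the input, for example if the algorithm runs in linear time or cubic time, then we say that the algorithm runs in polynomial time and that the problem it solves is in class P [3]. Problems in class P (polynomial time) are tractable and easy to solve.

NP-type problems -

A problem is assigned to the NP (nondeterministic polynomial time) class if it is solvable in polynomial time by a nondeterministic Turing machine. A P-problem is always also NP [4].

Considering the points mentioned above, and after analyzing our project's algorithm, it can be inferred that the problem is a P-type problem.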

3.2.5 Mathematical Model:

1. Let S be a system that describes the execution of the extension. S = {...}

2. Identify the modules as M

S= {M ...}

Where,

M = {E, R}

E = selection of topic given by author.

R = selection of comments under the topic.

3. Identify the modules of R as Mr

Mr = {Tl, To, FC, SD}

Where,

Tl = Time limit module.

To = Topic module.

SD = Sentence detection module.

FC = Facebook comments module.

[A] Input to Tl is T_limit

Where,

T_limit = Time limit specified for the module.

[B] Input to To is the topic given by the author

4. Identify the Processes as P

S = {M, P ...}

P = {Pt, Pr, Pl, Pcl, P_rep, Pe}

Where,

Pt = Process of evaluating time.

Pr = Reader authority, calculated as the number of distinct users who post a reply to the user's comment.

Pl = Number of likes on the comment.

Pcl = Algorithm that clusters the comments and assigns a weight to each cluster.

P_rep = Number of replies a comment has received = Rep(Ci).

Pe = Identification of all the named entities of the topic.

5. Identify the output as O.

S = {M, P, O...}

O = {Os, Op}

Where,

Os = Summary of the topic and comments.

Op = "Download as PDF" option.

6. Identify the success as Su.

S = {M, P, O, Su...}

Where,

Su = Success is when the summary is successfully displayed in the extension.

7. Identify the failure as F.

S = {M, P, O, Su, F...}

Where,

F = When nothing is displayed in the extension.
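The per-comment quantities Pr, Pl and P_rep from step 4 can be computed directly from a comment thread, as in the Python sketch below. The model does not state how these quantities are combined, so the weighted sum in `comment_weight` and its coefficients `w_r`, `w_l` and `w_rep` are assumptions added for illustration, as are the `Comment` data class and its field names.

```python
from dataclasses import dataclass, field

@dataclass
class Comment:
    author: str
    text: str
    likes: int = 0                                # Pl: number of likes
    replies: list = field(default_factory=list)   # replies, each a Comment

def reader_authority(comment):
    """Pr: number of distinct users who replied to the comment."""
    return len({reply.author for reply in comment.replies})

def reply_count(comment):
    """P_rep = Rep(Ci): number of replies the comment received."""
    return len(comment.replies)

def comment_weight(comment, w_r=1.0, w_l=1.0, w_rep=1.0):
    """Assumed combination: a weighted sum of Pr, Pl and P_rep."""
    return (w_r * reader_authority(comment)
            + w_l * comment.likes
            + w_rep * reply_count(comment))
```

With equal coefficients, a comment with 3 likes and 3 replies from 2 distinct users gets a weight of 8.0. The clustering step Pcl and the named-entity step Pe are outside the scope of this sketch.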

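IV. FEATURES

• The extension has a simple interface that non-technical users can use easily.

• It extracts valuable information from thousands of comments.

• It gives an overall idea of customers' opinions about a product.

• It extracts sentiment from the comments.

• The "Download as PDF" button saves the extracted information as a PDF file.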

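3.2.5 Advantages of the System:

• The extension can be installed on any desktop or laptop computer that has Google Chrome and used to summarize topics on the TechCrunch website.

• The extension reduces the time and effort needed to read the comments on trending topics.

• The extension gives an overall picture of the topic together with people's opinions, which most blog summarizers do not consider.

• The extension has a very friendly user interface, so any user can use it.

• The source code can be modified further to extend it to other browsers such as Mozilla Firefox, Internet Explorer and Opera.

• The source code can be modified further so that it collects user comments from more websites.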

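V. CONCLUSION

The proposed solution gives a good summary of long blog posts and their comments. It lets readers grasp the content of all the comments by reading the summary alone. The extension can also be extended to other browsers such as Firefox and Internet Explorer.

References: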

[1] Google Chrome official website https://www.google.com/chrome/

[2] Google Chrome Extensions https://chrome.google.com/webstore/category/extensions

[3] Feasibility Study - P type problems https://www.quora.com/What-are-P-NP-NP-complete-and-NP-hard

[4] Feasibility Study - NP type problems http://mathworld.wolfram.com/NP-Problem.html

[5] TechCrunch website as dataset http://techcrunch.com/

[6] Sentiment analysis https://en.wikipedia.org/wiki/Sentiment_analysis

[7] Comment oriented blog summarization, CIKM '07, November 6–8, 2007, Lisboa, Portugal. Copyright 2007 ACM 978-1-59593-803-9/07/0011

[8] "Mining the User Clusters on Facebook Fan Pages Based on Topic and Sentiment," IEEE IRI 2014, August 13-15, 2014, San Francisco, California, USA.

[9] Google Chrome Developer https://developer.chrome.com/extensions/getstarted

[10] Make Google Chrome extension https://chrome.google.com/webstore/category/extensions

[11] Natural language processing https://en.wikipedia.org/wiki/Natural_language_processing

[12] Web mining http://www.scaleunlimited.com/about/web-mining/
