Clustering of web documents using graph representations

Summary of the article: Clustering of Web Documents Using Graph Representations, Schenker A.; Bunke H.; Last M.; Kandel A. Source: course Moodle site. Students: Josnei Luis Olszewsky Jr. and Victor Volochtchuk de Araújo

Clustering is a method that separates a collection of objects of various types into groups, called clusters. It is an unsupervised process, which means that there are no training examples. Several algorithms exist for clustering documents, such as k-means, fuzzy c-means, hierarchical agglomeration, and graph partitioning. Document clustering is an important research area for two main reasons. First, once documents are grouped into categories, they become easier to search for and simpler to use. Second, clustering improves the performance of searching for and retrieving a document within a large collection. Hierarchical clustering, for example, is used for this kind of search and retrieval across a large number of documents.

Representing documents as vectors, in turn, is the most widely used approach: it is simpler and allows the use of traditional clustering methods that operate on numeric vectors. However, it discards information such as the order in which terms appear, where in the document a term occurs, and how far apart one term is from another. The problem with this approach is that it works only with numeric feature vectors, because distances between objects must be computed and this is easiest with such vectors. Documents therefore have to be converted into vectors of numeric values, and useful information about the document can be lost in that conversion. To solve this problem, an extension of the classical method of
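
The information loss described above for the vector representation can be illustrated with a minimal sketch. It assumes a plain term-frequency (bag-of-words) vector and Euclidean distance. The two sample sentences and the helper names `term_frequency_vector` and `euclidean_distance` are hypothetical, chosen only for illustration, and do not come from the summarized article.

```python
from collections import Counter
from math import sqrt


def term_frequency_vector(text, vocabulary):
    """Count how often each vocabulary term occurs in the text."""
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


def euclidean_distance(u, v):
    """Euclidean distance between two numeric vectors of equal length."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Two hypothetical documents with the same terms in a different order.
doc_a = "graph clustering beats vector clustering"
doc_b = "vector clustering beats graph clustering"

# Shared vocabulary, sorted so that both vectors use the same term order.
vocabulary = sorted(set(doc_a.split()) | set(doc_b.split()))

vec_a = term_frequency_vector(doc_a, vocabulary)
vec_b = term_frequency_vector(doc_b, vocabulary)

print(vocabulary)
print(vec_a, vec_b)
print(euclidean_distance(vec_a, vec_b))
```

Both documents map to the same vector, so their distance is zero even though the sentences say opposite things. Term order, position, and the distance between terms are not encoded in the vector at all.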
