Collecting Statistics about the Portuguese Web

dc.contributor.author: Gomes, Daniel [por]
dc.contributor.author: Silva, Mário J. [por]
dc.date.accessioned: 2009-02-10T13:11:41Z [por]
dc.date.accessioned: 2014-11-14T16:24:06Z
dc.date.available: 2009-02-10T13:11:41Z [por]
dc.date.available: 2014-11-14T16:24:06Z
dc.date.issued: 2003-06 [por]
dc.description.abstract: This report presents a characterization of text documents from the Portuguese Web. This characterization was produced from a crawl of over 4 million URLs and 131 thousand sites in 2003. We describe the rules that we established for defining its boundaries and the methodology used to gather statistics. We also show how crawling constraints and abnormal situations on the Web can influence the results. [por]
dc.identifier.uri: http://hdl.handle.net/10451/14211 [por]
dc.identifier.uri: http://repositorio.ul.pt/handle/10455/2916 [por]
dc.language.iso: por [por]
dc.publisher: Department of Informatics, University of Lisbon [por]
dc.relation.ispartofseries: di-fcul-tr-03-10 [por]
dc.subject: Web [por]
dc.subject: characterization [por]
dc.subject: Portuguese [por]
dc.subject: Portugal [por]
dc.subject: tumba! [por]
dc.subject: statistics [por]
dc.subject: crawling [por]
dc.title: Collecting Statistics about the Portuguese Web [por]
dc.type: report
dspace.entity.type: Publication
rcaap.rights: openAccess [por]
rcaap.type: report [por]

Files

Original bundle
Name: 03-10.pdf
Size: 248.62 KB
Format: Adobe Portable Document Format