Case Study: Assigning genre labels to random web pages

From WebGenreWiki

Survey (See Comments and Results)

Labels assigned by 8 authors to 50 random web pages:

  • Annotators:
  1.   Rehm       
  2.   Santini    
  3.   Braslavski 
  4.   Stubbe     
  5.   Symonenko  
  6.   Tavosanis  
  7.   Vidulin    
  8.   Sharoff (book chapter; for categories see http://corpus.leeds.ac.uk/serge/webgenres/)
  • Assessment:
 High Consistency in Label Assignments (5 or more annotators used the same label)
 Medium Consistency in Label Assignments (3 or 4 annotators used the same label)
 Low Consistency in Label Assignments (2 or fewer annotators used the same label)
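
The thresholds above can be written down as a small function. The following is only a minimal sketch: the function names are hypothetical, and the way labels are compared (lower-casing and splitting on commas) is an assumption made for illustration. The survey itself does not state how labels were matched, and the groupings on this page appear to have been made by hand.

```python
from collections import Counter


def consistency_level(agreeing_annotators: int) -> str:
    """Map the number of annotators sharing a label to the assessment level."""
    if agreeing_annotators >= 5:
        return "High"
    if agreeing_annotators >= 3:
        return "Medium"
    return "Low"


def most_shared_label(labels):
    """Return (label, count) for the label used by the most annotators.

    Assumption: a label such as "instruction, FAQ" counts as two labels,
    and comparison ignores case. Each annotator is counted once per label.
    """
    counts = Counter()
    for label in labels:
        # A set ensures one annotator cannot vote twice for the same label.
        parts = {part.strip().lower() for part in label.split(",") if part.strip()}
        counts.update(parts)
    if not counts:
        return ("", 0)
    return counts.most_common(1)[0]


# Labels given by the eight annotators to the second page listed below.
page_2_labels = [
    "Article", "how-to", "instruction, FAQ", "FAQ",
    "faq", "Explanation", "Informative", "Instruction",
]

label, count = most_shared_label(page_2_labels)
print(label, count, consistency_level(count))
```

Under these assumptions the second page comes out as Medium consistency on "faq" (three annotators), which agrees with the assessment recorded for it. Labels that differ only in wording, such as "Homepage" and "Newspaper Homepage", would not be matched by this simple comparison.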
  Low Consistency in Label Assignments
 
  1. Entry page of a digital information desk service
  2. query page
  3. FAQ, online help
  4. FAQ
  5. homepage of an answer community portal
  6. Homepage
  7. Community, Index
  8. Information/Linkerie
  Medium Consistency in Label Assignments (FAQ)
 
  1. Article          
  2. how-to           
  3. instruction, FAQ 
  4. FAQ              
  5. faq              
  6. Explanation      
  7. Informative      
  8. Instruction
  Medium Consistency in Label Assignments (newspaper)
 
  1.  Department entry page of an online newspaper 
  2.  online newspaper article snippet
  3.  newspaper, portal
  4.  Newspaper Homepage
  5.  a newspaper section
  6.  Homepage
  7.  Journalistic, Content Delivery
  8.  Information/Linkerie
  Low Consistency in Label Assignments
 
  1.  Entry page of the website of an academic organization
  2.  introductory page
  3.  link collection
  4.  Bibliography/List of Books
  5.  an entry page to a special digital library collection
  6.  Homepage
  7.  Index, Commercial/Promotional
  8.  Information/Linkerie
  Medium Consistency in Label Assignments (directory)
 
  1.  Index of a web catalogue 
  2.  directory
  3.  web directory (a special case of link collection)
  4.  Linklist
  5.  a web directory section
  6.  List of resources
  7.  Index
  8.  Information/Linkerie
  Medium Consistency in Label Assignments (article)
 
  1.  Article 
  2.  car review/ article
  3.  article
  4.  Advertisement/Reportage
  5.  a new product infomercial
  6.  Product review
  7.  Journalistic
  8.  Discussion
  Medium Consistency in Label Assignments (entry)
 
  1.  Entry in an online encyclopedia 
  2.  wikipedia entry
  3.  wikipedia entry (informative)
  4.  Encyclopedia
  5.  a Web 2.0 encyclopedia entry
  6.  Explanation
  7.  Informative
  8.  Discussion
  Low Consistency in Label Assignments
 
  1.  Entry page of a sports news website 
  2.  online sport newspaper front page
  3.  newspaper, portal
  4.  Crashed my Browser. News?
  5.  a homepage of a sports portal
  6.  Homepage
  7.  Journalistic, Index
  8.  Information
  Low Consistency in Label Assignments
 
  1.  Research article 
  2.  citation with access option
  3.  reference, informative
  4.  Scientific Article
  5.  an entry page to an academic article available on subscription basis
  6.  Academic paper
  7.  Scientific
  8.  Discussion
  Medium Consistency in Label Assignments (homepage)
 
  1.  Entry page of a finance news website 
  2.  blog entries with mini-surveys
  3.  newspaper, portal
  4.  Newspaper Homepage: News, Polls, Statistics
  5.  a homepage of a finance portal
  6.  Homepage
  7.  Journalistic, Index, User Input
  8.  Information/Linkerie
  Low Consistency in Label Assignments
 
  1.  Department entry page of a news website 
  2.  composite informational 
  3.  newspaper, portal
  4.  Newspaper Homepage
  5.  a topic-specific section of an information portal
  6.  Homepage
  7.  Journalistic, Index
  8.  Information/Linkerie
  Low Consistency in Label Assignments
 
  1.  Entry page of the website of a research journal 
  2.  composite informational 
  3.  newspaper, portal
  4.  "About-page"
  5.  a homepage of a subscription-based academic journal
  6.  Homepage
  7.  Commercial/Promotional
  8.  Information
  Low Consistency in Label Assignments
 
  1.  Entry page of the website of a research journal 
  2.  table of contents with snippets
  3.  portal, link collection
  4.  Bibliography/List of Articles
  5.  a homepage of a subscription-based academic journal
  6.  Homepage
  7.  Index, Content Delivery
  8.  Information
  Medium Consistency in Label Assignments (article)
 
  1.  News article 
  2.  article preview with commercial
  3.  newspaper
  4.  News
  5.  an entry page to a free article in a newspaper
  6.  News article
  7.  Journalistic, Content Delivery
  8.  Discussion
  Medium Consistency in Label Assignments (academic paper)
 
  1.  Research article 
  2.  academic paper
  3.  scientific paper
  4.  Scientific Paper
  5.  an academic article
  6.  Academic paper
  7.  Scientific
  8.  Discussion
  Low Consistency in Label Assignments
 
  1.  Feature article 
  2.  tv program presentation
  3.  newspaper
  4.  "About-page"
  5.  a tv guide page for a particular program
  6.  Explanation
  7.  Journalistic
  8.  Information/Linkerie
  Low Consistency in Label Assignments
 
  1.  Discussion forum 
  2.  blog post
  3.  forum
  4.  Bulletin Board
  5.  a thread in a discussion board
  6.  Discussion
  7.  Community
  8.  Discussion
  Medium Consistency in Label Assignments (shop)
 
  1.  Product description 
  2.  bookshop citation
  3.  product page, eshop
  4.  Shop
  5.  a product page in an online bookstore
  6.  Product review
  7.  Shopping
  8.  Information
  Medium Consistency in Label Assignments (advertisement)
 
  1.  Product feature advertisement 
  2.  informational+advertising web page
  3.  product page, promotion
  4.  Advertisement/Information
  5.  an ad/infomercial
  6.  Product information
  7.  Commercial/Promotional
  8.  Propaganda
  Medium Consistency in Label Assignments (shop)
 
  1.  Entry page of a webshop 
  2.  shopping catalogue
  3.  eshop
  4.  Shop
  5.  a homepage of a mega-shopping site
  6.  Homepage
  7.  Index
  High Consistency in Label Assignments (article)
 
  1.  News article 
  2.  article/commentary
  3.  article
  4.  Feature
  5.  a newsletter article
  6.  News article
  7.  Journalistic
  Medium Consistency in Label Assignments (list)
 
  1.  Listing of discussion forums 
  2.  index/search page/post snippets
  3.  portal
  4.  Linklist+Bulletin Board
  5.  a homepage of a women's community site
  6.  List of resources
  7.  Index, Community
  Medium Consistency in Label Assignments (search (page/engine))
 
  1.  Entry page of a job search engine 
  2.  search page (jobs)
  3.  a specialized search engine, online service
  4.  Search
  5.  a homepage of a job portal
  6.  Homepage
  7.  User Input, Index
  Low Consistency in Label Assignments
 
  1.  List of news items 
  2.  composite subwebsite entry page
  3.  newspaper
  4.  News
  5.  an entry page to a shopping section in an online newspaper
  6.  Journalistic [supergenre]
  7.  Journalistic
  High Consistency in Label Assignments (article)
 
  1.  Feature article 
  2.  article (explanatory)
  3.  article
  4.  Feature
  5.  an article in a magazine
  6.  News article
  7.  Journalistic
  Low Consistency in Label Assignments
 
  1.  List of discussion forums 
  2.  ranked toc
  3.  forum
  4.  Bulletin Board
  5.  a list of threads in a discussion board
  6.  Discussion
  7.  Community
  High Consistency in Label Assignments (patent)
 
  1.  Patent specification 
  2.  toc with snippets
  3.  patent, close to scientific paper
  4.  Patent
  5.  a patent page
  6.  Law
  7.  Scientific
  Medium Consistency in Label Assignments (search (page/engine))
 
  1.  Department entry page of a webshop 
  2.  search page with tips
  3.  newspaper, portal
  4.  Search+Information
  5.  an entry page to a custom insurance quote
  6.  Search
  7.  Commercial/Promotional, Journalistic, Index
  Medium Consistency in Label Assignments (proposal)
 
  1.  Project proposal 
  2.  proposal
  3.  formal document, somewhat close to scientific paper
  4.  Proposal+Lexicon
  5.  a publication proposal
  6.  Proposal
  7.  Scientific
  Medium Consistency in Label Assignments (search (page/engine))
 
  1.  Lonely hearts ad search engine 
  2.  search page
  3.  online service
  4.  Search
  5.  a dating page in a newspaper
  6.  Search
  7.  User Input
  Medium Consistency in Label Assignments (list)
 
  1.  Entry page of the website of a collection 
  2.  index
  3.  link collection
  4.  Linklist
  5.  an entry page to a special digital library collection
  6.  List of resources
  7.  Index
  High Consistency in Label Assignments (blog)
 
  1.  Blog 
  2.  blog
  3.  blog
  4.  Reportage/Diary
  5.  a blog's homepage
  6.  Blog
  7.  Blog, Personal
  Low Consistency in Label Assignments
 
  1.  List of musicians and bands 
  2.  home page
  3.  portal
  4.  Homepage/Linklist
  5.  an entry page to a topic-specific online community
  6.  List of resources
  7.  Content Delivery, Index
  Medium Consistency in Label Assignments (list)
 
  1.  List of games 
  2.  entry page subwebsite+catalogue
  3.  portal
  4.  Homepage/Linklist
  5.  an entry page to an online gaming community
  6.  List of resources
  7.  Index
  High Consistency in Label Assignments (article)
 
  1.  News article 
  2.  articles + boxes for comments
  3.  article
  4.  Reportage
  5.  an article in a magazine
  6.  News article
  7.  Journalistic
  Medium Consistency in Label Assignments (search (page/engine))
 
  1.  Department entry page of an online newspaper 
  2.  entry page subwebsite+search page
  3.  newspaper, portal
  4.  Search+Information
  5.  a topic-specific (real-estate?) section in a newspaper
  6.  Search
  7.  User Input, Shopping, Journalistic
  Low Consistency in Label Assignments
 
  1.  List of job advertisements 
  2.  informational
  3.  don't know
  4.  Job Offers
  5.  a job ad page in a local government's website
  6.  List of resources
  7.  Commercial/Promotional
  Low Consistency in Label Assignments
 
  1.  Entry page of a travel/flight search engine 
  2.  search page (orbitz)
  3.  online service
  4.  Shop
  5.  an entry page to a travel booking website
  6.  Homepage
  7.  User Input
  Medium Consistency in Label Assignments (article)
 
  1.  Technical Article 
  2.  scientific article (contains formulas)
  3.  article
  4.  Instruction + Code Listing
  5.  an article in a personal website
  6.  Essay
  7.  Scientific
  Low Consistency in Label Assignments
 
  1.  Guidelines 
  2.  instructional with index (code of practice)
  3.  official document
  4.  Instruction
  5.  a section in an academic procedures (quality & standards?) document
  6.  Law
  7.  Official
  Low Consistency in Label Assignments
 
  1.  List of publications 
  2.  references (with explanation)
  3.  scientific paper
  4.  Scientific Paper+Bibliography
  5.  a brief review of a topic-specific research
  6.  Academic paper
  7.  Scientific
  Medium Consistency in Label Assignments (article)
 
  1.  Feature article 
  2.  article
  3.  redirected, don't know
  4.  Reportage
  5.  an article in a magazine
  6.  News article
  7.  Journalistic
  Medium Consistency in Label Assignments (informative/information)
 
  1.  Out of service notification 
  2.  list of references
  3.  informative, don't know
  4.  Error Message
  5.  a research project information
  6.  Product information
  7.  Informative, Index
  Medium Consistency in Label Assignments (paper)
 
  1.  Out of service notification 
  2.  abstract
  3.  scientific paper
  4.  Scientific Paper
  5.  an entry page to an academic article available on subscription basis
  6.  Academic paper
  7.  Scientific
  High Consistency in Label Assignments (article)
 
  1.  News article 
  2.  article
  3.  article
  4.  Reportage
  5.  an article in a newspaper
  6.  News article
  7.  Journalistic
  Low Consistency in Label Assignments
 
  1.  List of educational material/teaching material 
  2.  lesson/search page/list of links with lessons
  3.  link collection? don't know
  4.  Lessons
  5.  an entry page to a lesson plans' section
  6.  Product information
  7.  Scientific, Content Delivery
  Low Consistency in Label Assignments
 
  1.  Index of a web catalogue 
  2.  search+ catalogue
  3.  link collection
  4.  Linklist
  5.  a section in a web directory
  6.  List of resources
  7.  Commercial/Promotional
  Medium Consistency in Label Assignments (search (page/engine))
 
  1.  Entry page of a yellow pages search engine 
  2.  search
  3.  online service
  4.  Search
  5.  an entry page in yellow pages
  6.  Search
  7.  User Input
  Low Consistency in Label Assignments
 
  1.  Guidelines 
  2.  short introduction
  3.  informative, don't know
  4.  Glossary
  5.  This document is not based on a genre. Or, it is a subgenre - like a legend key (an explanation of a particular sign)
  6.  Law
  7.  Official
  Medium Consistency in Label Assignments (informational/information page)
 
  1.  Instructions 
  2.  instructional+informational
  3.  product page, promotion
  4.  Advertisement
  5.  a service information page
  6.  Product information
  7.  Commercial/Promotional

Comments