Naming conventions

From WebGenreWiki

Revision as of 12:41, 23 June 2008 by Serge925 (Talk | contribs)

Choosing and Naming Categories

As a purely practical matter, we need to agree on a naming convention for our genres - for example, "journalism" or "journalistic" or "journalistic material" (at the moment we have all of them in our list).

  • Categories should be useful for search tasks
  • Should they cover every document vs. cover useful documents vs. cover only text-documents?

Mitja: Useful only - otherwise we will spend a lot of time agonizing over what genre some page belongs to, even though nobody will ever want to know about that page. Useless pages are particularly tricky to classify. A useful page is a page that someone will conceivably search for. Or is it too difficult to decide what is useful?

  • Avoid "Other" Category

Mitja: We should have a "useless" category, but no "useful other".

  • don't mix with topic, authorship, private vs. public etc.
  • Naming: no mixture of adjectives and nouns (Informative vs. Gateway). Prefer nouns.
  • Or: Nouns for the bottom-level genres ("news", "reportage", "interview") and adjectives for the supergenres ("journalistic").

Mitja: Probably best to stick to nouns whenever possible. If we really cannot come up with a good noun, we use whatever is appropriate.

  • I also suggest we avoid superfluous words like "material".
  • Can we simply call genres "genres"? Or is there any good reason for calling them "(genre) categories"?
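
The naming rules listed above could be checked mechanically against a candidate list. The following is only a rough sketch: the function name check_genre_name and the word lists are hypothetical, and the suffix test for adjectives is a crude assumption that will misfire on nouns such as "narrative".

```python
# Minimal sketch of a naming-convention check (all names here are hypothetical).
SUPERFLUOUS_WORDS = {"material"}        # example taken from the list above
CATCH_ALL_NAMES = {"other"}             # the "Other" category to avoid
ADJECTIVE_SUFFIXES = ("istic", "ive")   # assumption: crude adjective heuristic


def check_genre_name(name):
    """Return a list of convention warnings for one candidate genre name."""
    warnings = []
    words = name.lower().split()
    if any(word in SUPERFLUOUS_WORDS for word in words):
        warnings.append("superfluous word")
    if any(word in CATCH_ALL_NAMES for word in words):
        warnings.append("catch-all category")
    # Only the last word is tested, so "journalistic material" is reported
    # for its superfluous word rather than for the adjective.
    if words and words[-1].endswith(ADJECTIVE_SUFFIXES):
        warnings.append("looks like an adjective")
    return warnings


for candidate in ["journalism", "journalistic", "journalistic material", "Other"]:
    print(candidate, check_genre_name(candidate))
```

A check like this only flags candidates for discussion; it cannot decide whether a name is a good genre name.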

Marina (7 Dec. 2007): The issue of "genre names" is a very delicate one. I suggest reading Görlach M. (2002). “What’s in a Name? Terms Designating Text Types and the History of English” [1] and Görlach (2004: 9) [2], where he says "In sum, we can state that when text types become conventionalized, the need for specific designation arises. These names will be in the form of new items - ad-hoc compounds or paraphrases which in due course will become lexicalized - or consist in existing terms applied to the new text-specific context, or derived by metonymy from the object on which the text is placed or (more frequently) the name of an action/activity being transferred to the result in form of a written text, and finally speech acts coming to be also used as text types. While these restrictions are comparatively easy to formulate in principle, they are very difficult to apply to specific instances. This becomes quite evident when a list of potentially relevant terms and their distinctive feature is put to test [my emphasis]."
I think we have all experienced this. I feel quite uncomfortable with genre names like "portrayal" (Meyer zu Eissen and Stein, 2004) or "search start" (Rosso, 2005). Yes, they can be taken as descriptive designations, but are they "genre names"? I would say: no. In all our genre palettes, we mostly follow our instinct. I do not think this is enough. This has led us to the present chaotic situation. Swales makes interesting remarks about genre names in a section with the title "A discourse community's nomenclature for genres is an important source of insight" (Swales, 1990: 54-57). But then in his genre definition he states: "The genre names inherited and produced by discourse communities and imported by others constitute valuable ethnographic communication, but typically need further validation" (Swales, 1990: 58).
As far as I know, Görlach is the only one who has tried to draw up an inventory of existing genres for English (Görlach, 2004: 24-88). Unfortunately this inventory does not include any web genres. I could detect a single electronic genre, i.e. email (Görlach, 2004: 41).
In conclusion, there are many useful categories, but they are not all necessarily genres. Who decides what is a genre and what is not a genre? Genre analysts, the users, IR practitioners?
Andrea and Mitja, I understand your eagerness to identify a bunch of so-called useful genre categories, but it seems to me that we have not addressed a few core questions:

  • how can we be sure that a category is a genre and not something else?
  • how can we be sure that a category name is a genre name?

Personally, I am not happy with a pure bottom-up approach (Rosso), nor with a top-down approach (Andrea, Mitja), nor with a mix of bottom-up and top-down approaches (Meyer zu Eissen and Stein). In his palette, Rosso includes genres of different granularity, and some non-genre categories (images or search start). My suspicion is that a classifier can be taken aback by this heterogeneity. Mitja/Vedrana's genre palette is a mixture of topical & non-topical categories at different levels of granularity. Andrea's palette is hierarchical, which is good because it can cover genres and supergenres, but again the genre names or the categories are debatable: I was quite unhappy with the category "Nothing" and with "Marginal Note" (sorry, I do not know what a marginal note is). Also, a supergenre "Documentation" including a genre "law" ... uhm... does not look very convincing.
I definitely think that a genre palette needs validation from the bottom, i.e. the users (after all, we are nothing more than a bunch of genre geeks), but validation is difficult to design. Rosso has started the discussion on this. I think we should build on his experience and try to refine it.

Andrea: I'd propose that we first try to create a list of genres - without thinking too much, just brainstorming. In a following discussion we can try to filter those classes and choose nice names. This is already a limited "user validation". Later we can try to find more users, build a sample genre-enabled search application or start an online survey. I think we shouldn't start a discussion about how difficult this is but simply try to do something! Maybe we'll find out that in the end it wasn't that difficult at all. We have a lot of resources to start with: user studies, our own research, work from others.
I also think the question "how can we be sure that a category name is a genre name?" is way too philosophical. Why don't we just use names that seem to fit and that everyone (or at least everyone of us) is able to understand? I'm not really a linguist, so I guess I'm ignorant enough to be an example for the average web user :)

Number of Genres, Hierarchy

  • Genres can be narrower or broader: they range from genres like "news" to supergenres like "journalism", forming hierarchies based on a subcategory IS A supercategory relation, like "news IS journalism". A question we need to answer is whether our hierarchy should be a tree - "weather report", for example, can be "information" and "journalism". I think we should stick to a tree if at all possible.
  • Hierarchical (IS A) organisation - use a two-level hierarchy

Mitja: The lowest level is very fine-grained (concrete genres). The next level is a bit more abstract and contains the right number of genres for our corpus. We can add more levels if appropriate, but they are not essential.

  • Granularity - It's important not to mix granularity on one level of the hierarchy. For example, "Informative material" and "FAQ" should not sit on the same level.
  • Number of Genres - between 10 and 50. More are confusing.
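
Whether a draft palette really forms a two-level tree can be tested once the IS A links are written down as pairs. The sketch below is only an illustration: the IS_A list and the function names are hypothetical, and the pairs simply restate the examples used on this page, including the "weather report" case with two possible supergenres.

```python
# Hypothetical IS A links as (genre, supergenre) pairs, based on the examples above.
IS_A = [
    ("news", "journalism"),
    ("reportage", "journalism"),
    ("interview", "journalism"),
    ("weather report", "journalism"),
    ("weather report", "information"),
]


def parents_by_genre(pairs):
    """Map each genre to the set of supergenres it is assigned to."""
    parents = {}
    for genre, supergenre in pairs:
        parents.setdefault(genre, set()).add(supergenre)
    return parents


def tree_violations(pairs):
    """Genres with more than one supergenre; a tree allows only one."""
    return {
        genre: sorted(supers)
        for genre, supers in parents_by_genre(pairs).items()
        if len(supers) > 1
    }


def depth_violations(pairs):
    """Supergenres that have a supergenre themselves, i.e. a third level."""
    parents = parents_by_genre(pairs)
    return sorted({s for supers in parents.values() for s in supers if s in parents})


print(tree_violations(IS_A))
print(depth_violations(IS_A))
```

If further levels are added later, as suggested above, the depth check would simply be dropped; the tree check stays useful either way.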

Hierarchical (CONTAINS) Organisation

  • Genres occur on different units: parts of a page, a single page, a page collection (website)
  • Therefore we may need multiple sets of categories for each unit, together with a CONTAINS relation between genres.

Andrea: There are only two levels: container-genres and the genres contained. These two classes do not necessarily appear on different units. You can have a collection at page- or at site-level or even have, for example, one novel being split over multiple pages.

Andrea: Examples are

  • container: blog. contains (for example): poem, news, code listing
  • container: scientific paper. contains: statistics, bibliography
  • an offline example: container: newspaper. contains: interview, news, reportage, portrait, ...
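
As a rough sketch, the examples above can be stored as a mapping from container genre to contained genres. The names CONTAINS and containers_of are hypothetical, and the sets hold only the genres mentioned in the examples; the newspaper list is open-ended in the original.

```python
# Hypothetical CONTAINS mapping built from the examples above.
CONTAINS = {
    "blog": {"poem", "news", "code listing"},
    "scientific paper": {"statistics", "bibliography"},
    # The original list for "newspaper" is open-ended; only named genres appear here.
    "newspaper": {"interview", "news", "reportage", "portrait"},
}


def containers_of(genre, contains=CONTAINS):
    """Return the container genres in which the given genre can appear."""
    return sorted(c for c, parts in contains.items() if genre in parts)


print(containers_of("news"))
print(containers_of("poem"))
```

In this sketch "news" appears in more than one container, so the CONTAINS relation is stored as a many-to-many mapping rather than as a tree.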

Marina (7 Dec. 2007): I think we can distinguish among different types of complex organization. We can have structural hierarchy. For instance, a website contains a web page, which, in turn, contains a web page section/portion. Genre can be instantiated at any of these levels. Consequently, we have different units of genre analysis.
At the level of genre analysis, we can have bound and free genres (Görlach, 2004: 106). For instance, a headline or a footnote is always part of another encompassing genre (e.g. a newspaper article). I think we all want to work with free genres. There are also conglomerate genres (I am not sure yet if this is the same as "supergenre", I do not think so...) (Görlach, 2004: 106). For example a "newspaper" is made up of editorials, comments, sports reports, weather forecasts, classifieds, obituaries, etc.

Andrea: What we called "supergenres" here refers to the "is a" relation (hyponymy), while "conglomerate genres" refer to the "contains" relation (meronymy).