A VERY short introduction to document analysis.
STEP 1. Know everything. Understand everything ...
STEP 2. Give up on that. Acknowledge your limitations.
Infinite and wonderful variety in books.
Equally infinite choices in what to mark up. Markup is always interpretation.
- You can mark up what is there.
- You can mark up what you choose to find there.
- You can mark up what is only implicitly there (tagging silences).
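As a rough illustration of the three choices above, here is a minimal sketch in Python. The sample line and the element and attribute names (`line`, `hi`, `name`, `gap`, `rend`, `type`, `reason`) are assumptions made up for this example, not a scheme prescribed by these notes.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of one printed line. Every element and attribute
# name here is an illustrative assumption.
SAMPLE = (
    '<line>'
    '<hi rend="italic">To the Reader</hi>'          # a visible change of typeface
    ', from <name type="person">the Author</name>'  # an interpretive judgment
    '<gap reason="no-date-given"/>'                 # a silence made explicit
    '</line>'
)

# Which of the three kinds of markup each (assumed) element stands for.
KINDS = {
    "hi": "what is there",
    "name": "what you choose to find there",
    "gap": "what is only implicitly there",
}

def describe(xml_text):
    """List (tag, kind) pairs in document order for the tags in KINDS."""
    root = ET.fromstring(xml_text)
    return [(el.tag, KINDS[el.tag]) for el in root.iter() if el.tag in KINDS]

if __name__ == "__main__":
    for tag, kind in describe(SAMPLE):
        print(f"<{tag}>: {kind}")
```

Note that the words on the page are the same in all three cases; only the encoder's decisions differ, which is the sense in which markup is always interpretation.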
One can try to distinguish between categories of features (but they are interrelated):
- Linguistic features
- Presentational features
- Structural/rhetorical features
- Artefacts of print
- Etc.
Mostly you will be responding to obvious physical cues, asking "What is
this thing?" "What is it here for?" "How does it relate to the other things
here?" Visual cues aren't everything, and they can be misleading, but they are
a start, especially if you are writing instructions that someone else
will use to recognize features.
Leverage your knowledge. It helps if you know something about...
- genre
- print conventions
- language
- subject matter
- author
- period
- the work itself
But expect to be ignorant at least sometimes. Use what you know and
allow for incomplete tagging.
On the other hand, sometimes one is left genuinely at a loss.
Sample
Selden Selden
SO: pick a couple of samples and look at them:
- What are the salient features?
- How would you instruct someone to recognize them?
- How do they relate to each other?
- What would you gain by marking up one feature set over another?
- Are there advantages to adding information, normalizing, or making things explicit?
- Is there anything anomalous or inexplicable, or simply difficult?
- Are there any concurrent (overlapping, conflicting) organizational hierarchies?
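To make the last question concrete: a paragraph that runs across a page break belongs to two hierarchies at once (paragraphs and pages), and a single tree cannot nest both. One common workaround, sketched below under assumptions, is to keep one hierarchy as containers and record the other as empty "milestone" elements. The element names (`text`, `p`, `pb`), the `n` attribute, and the sample sentences are hypothetical and chosen only for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: <p> elements are containers; <pb/> is an empty
# milestone marking where a new page begins. All names are assumptions.
SAMPLE = (
    '<text>'
    '<pb n="1"/>'
    '<p>A paragraph that begins on one page '
    '<pb n="2"/>and ends on the next.</p>'
    '<p>A second paragraph.</p>'
    '</text>'
)

def text_by_page(xml_text):
    """Return {page number: [text fragments]} in document order."""
    pages = {}
    current = None  # text before the first <pb/> is filed under None

    def add(fragment):
        pages.setdefault(current, []).append(fragment)

    def walk(el):
        nonlocal current
        if el.tag == "pb":
            current = el.get("n")  # a milestone switches the current page
        if el.text:
            add(el.text)
        for child in el:
            walk(child)
            if child.tail:  # text following the child, inside el
                add(child.tail)

    walk(ET.fromstring(xml_text))
    return pages

if __name__ == "__main__":
    for page, fragments in text_by_page(SAMPLE).items():
        print(page, fragments)
```

The trade-off is that the milestone hierarchy is no longer a set of containers: recovering "everything on page 2" takes a walk like the one above rather than a simple lookup, so it is worth deciding early which hierarchy matters most for your purposes.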