Description
The second point in tfd-nusantara-philology issue #184 reminds me of an unfinished email discussion with some colleagues about how to encode marginalia in manuscripts. Since we are trying to align the encoding model for diplomatic editions of manuscripts as closely as possible with what we do for inscriptions, it would be best if we take a decision that will apply in the EGD and then inherit from there for manuscripts.
I reproduce the earlier email discussion here:
- A phenomenon commonly found in palmleaf mss., in my experience, is for brief expressions related to the contents of a given leaf to be made in the (left) margin of a folio. Here is an example from Bali (photo attached):

Here is a transcription in progress, to be converted later to proper TEI. The marginal expression hapakrama matches the same expression in 5v4. Any recommendations for the proper TEI element to be used for representing such marginalia? Should they be included in `<fw>`, or not?
<fw n="5v">5</fw>
<margin>**hapakrama**</margin>
[5v1]ṅatasmīn·,
hananya tan tumut riṁ pr̥thivī, jala:pra:kr̥tti.0. nya:ṁ śastrokta, Ikaṁ vastu tuṅga[SH]l·, maṅḍe pa:paniṁ makveḥ. guru, Apakrama:ṁ*, kr̥tva, ca, ti, toḥ, ga:mya-ga:manaḥ, śavapucyaḥ, tri
[5v2]ṇa:ṁ, śiṣya(ḥ), patitaḥ, praṇa:tenaḥ, tu. harṭa:nya, saṁ panaḍaha:n· bhaṣma:, yan apakrama, maṅū[SH]lah dūrśśila, yeka:pakrama:. hagamya-gamana:, ṅa, yan saṁ vatək vikva:, yan vratya, Ika:naṅ kumurR̥n·
[5v3]strī /ta\n ṣayogya, pakanakbya,. strī laraṅan·, bva:t javanya:nḍenatitalakṣaṇa:, malavas sira ma[SH]ka:pravr̥tyaṁ maṅkana:. niha:n katlunya, tekaṁ vikū puccaśava:, saṁ panaḍaha:n saṁska:ra, sḍa:ṅ hiṁ praṇantika:
[5v4]nira, kataman· vastu śudḍa:, mvaṁ **hapakrama**, yeka: patitasava:, ṅa, patitani{ṁ}kaṁ triviḍa, {?} ke[SH]lu patita taṁ śiṣya kabeḥ, maka:nimitaṁ bhaktiniṁ śiṣya, mapa lvirniṁ paka:strīnika:ṅ aga:mya-ga:mana:, niha:n śru
- @danbalogh responded initially: If I understand rightly that these occur on several pages and relate to the contents of a page, then `<fw>` is acceptable, but there may well be a more dedicated TEI element.
- Patrick McAllister then responded: I agree, you could include it in tei:fw if this feature is repeated on many/all folios. I think I would have gone for tei:label, however, since I keep forgetting what a "forme work" is supposed to be (and once I do look it up I discover that I get a strong feeling it shouldn’t be used for manuscripts, in disagreement with the TEI). Your case seems similar to the third example here: https://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-label.html. A third option could also be tei:head, if your schema allows it at that point and if (you think) this feature has a sectioning function.
- I reacted: Since we use `<fw>` for encoding marginal page numbering, what would seem most logical to me is to use `<label>` and wrap it inside `<fw>`: `<pb n="16v"/><fw place="left" n="16v"><label>TEST</label><num value="16">16</num></fw>`. Whether due to general TEI rules or due to the schema in the file I used to test this, it doesn’t seem to be valid. I also tested the following alternative: `<pb n="16v"/><fw place="left" n="16v"><num value="16">16</num></fw><label place="margin">TEST</label>`. But this doesn’t seem to be valid either. By the way, @dan, why don’t we encode page numbers as per the example given at https://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHSK, namely `<fw type="pageNum" place="top-right">29</fw>`? It seems that it might be possible to use two `<fw>`s consecutively, although again our schema(s) currently don’t seem to allow it: `<pb n="16v"/><fw type="pageNum" place="left">16</fw><fw type="label" place="left" n="16v"></fw>`
- Dan reacted again: It seems to me that the prohibition of `<label>` inside `<fw>` is TEI and not specific to our schema. So you'll have to use either one or the other. Anyway, I don't see an advantage to using both. Although Patrick is right that `<fw>` is essentially meant for printed stuff (the term "forme work" being a printing technical thing), I have absolutely no problem with using it for hand-written documents when the effect is similar to forme work in the literal sense. In fact, now that I look at the guidelines, it actually says this explicitly: "Although the name derives from the term forme work, used in description of early printed documents (the ‘forme’ being the block used to hold movable type), the fw element may be used for such features of any document, written or printed." So I would suggest choosing along the following lines:
- if these things occur on most pages, and they are always or usually in the same location with respect to the page, I would choose fw
- if they occur only on occasional pages, I would choose label
- Moreover, if the location varies, I would prefer label associated with (placed directly after the opening tag of) a `<p>` or `<lg>` element, or even an `<lb/>` if the labels can be associated with lines but not with a specific semantic/prosodic chunk.
- Once you've made your decision, I'm pretty sure @michaelnmmeyer [AG: this now to be handled by @ajaniak ?] can adjust the schema to accommodate the preferred solution. Two `<fw>` elements should, in any case, be permitted by our schema, as the EGD (§3.3.5) says "should your inscription have two (or more) foliation marks on a single page, encode two (or more) elements one after the other, in an order that seems most logical". @michaël: if that is non-conformant to TEI, please let me know because then it'll have to be removed from the EGD and an alternative solution needs to be recommended.
- Re Arlo's question "why don’t we encode page numbers as per the example": I don't see a substantive difference between the example and our encoding. Minor differences I see are:
- We have an @n, which is not there in the TEI guidelines example. I have no recollection why; I believe it was Axelle's suggestion to make identification clearer. If you think we should discard those n-s, and Michaël says we don't need them for any technical purpose, then I don't mind discarding them.
- The TEI example has a @type, which is not there in our encoding. Since, in the EGD at least, we did not foresee multiple types of forme work, there is no need to add an explicit type. Note that the old Guide says, "if you encounter a different [=other than page/folio numbering] feature that you believe ought to be marked up as forme work, please discuss the issue with the authors of the Guide". In the new release, the corresponding instruction is: "In our encoding practice, the use of forme work shall be restricted to cases where very short, identical or similar pieces of text appear in conjunction with specific pages of a document involving pagelike partitions. In other cases, consider whether the text item you are dealing with is rather an incipit (§3.8.2) or a colophon (§3.8.3), and if neither is applicable, consult the authors of this Guide."
- Be that as it may, given that in epigraphic texts forme work is pretty rare and any types other than numeration are likely to be extremely rare to nonexistent, I still see no need for a type. If you choose fw for the above situation AND you want to distinguish it with @type from pagination/foliation marks in the manuscript guide, I'm OK with that and not opposed to adopting the same in the EGD for consistency across our editions.
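
For illustration only, the three placement options in Dan's list above (the bullets on choosing between fw and label) might look roughly as follows. This is a sketch under assumptions: the folio number, the `TEST` content, the @place values and the line numbers are placeholders, and nothing here has been validated against the project schema.

```xml
<!-- Illustrative sketch only: placeholder values, not validated against the project schema -->
<div>
  <!-- Option 1: the item recurs on most pages in a fixed position, so it is treated as forme work -->
  <pb n="16v"/>
  <fw place="left" n="16v">TEST</fw>

  <!-- Option 2: occasional item tied to a text chunk; label directly after the opening tag of p (or lg) -->
  <p><label place="left">TEST</label>text of the paragraph</p>

  <!-- Option 3: occasional item tied to a line only; label directly after the lb for that line -->
  <p><lb n="1"/>text of line 1
     <lb n="2"/><label place="left">TEST</label>text of line 2</p>
</div>
```

In options 2 and 3 the label sits inside the text flow, in keeping with the point that it is attached to a specific chunk or line rather than to the page as a whole.
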
- I asked: I am not sure that such things occur on most pages of any document that contains them. Is this requirement of predominant presence a hard one?
- Am I right that it derives from the guidelines speaking of 'any of the unchanging portions of a page forme’?
- I think that the need to encode such things will arise mainly in what I have so far called ‘diplomatic editions’ (other than epigraphic editions). These tend to be very low on semantic containers, so we need to be able to associate whatever element we adopt for encoding them with `<lb/>` or `<pb/>`.
- Yes, consistency of encoding models between our different types of editions is one of my concerns.
- `<label>` seems a bit annoying to me because apparently it requires stating @xml:lang in every instance.
- If we can be soft on the requirement of occurring on most pages, then I think I would prefer to work with two @types of `<fw>`, one ‘pageNum’ and one ‘label’. Both should bear @n and @place as well. The substantive difference I had in mind was indeed the use of @type on `<fw>`, which is present in all examples in the TEI guidelines but not in our encoding model so far. If not, maybe we can consider encoding with `<note type="marginal">`, to be placed right next to the closest relevant `<pb/>` or `<lb/>`?
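
As a rough sketch of the two alternatives just proposed (two typed `<fw>` elements versus `<note type="marginal">`), the encodings could look something like this. The folio numbers, the `TEST` content and the `place="left"` value are placeholders, and whether either form passes the current schema is exactly what is still under discussion.

```xml
<!-- Illustrative sketch only: placeholder values, not validated against the project schema -->
<div>
  <!-- Alternative A: two consecutive fw elements distinguished by @type, both with @n and @place -->
  <pb n="16v"/>
  <fw type="pageNum" place="left" n="16v">16</fw>
  <fw type="label" place="left" n="16v">TEST</fw>

  <!-- Alternative B: a marginal note placed right next to the closest relevant pb or lb -->
  <pb n="17r"/>
  <note type="marginal">TEST</note>
</div>
```
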
- Dan responded again: The TEI guidelines you cite continue explicitly: "The fw element may be used to encode any of the unchanging portions of a page forme, such as:
  - running heads (whether repeated or changing on every page, or alternating pages)
  - running footers
  - page numbers
  - catch-words
  - other material repeated from page to page, which falls outside the stream of the text
It should not be used for marginal glosses, annotations, or textual variants, which should be tagged using gloss, note, or the text-critical tags described in chapter 13 Critical Apparatus, respectively."
- Since in our texts, pagination/foliation numbers are not always present on every page/folio, I don't think we need something to be present in every page for it to qualify as forme work. But given the above, I think we should not use fw for anything that is not at least predominantly present. You haven't said as much, but I assume from your questions that the things you want to encode are more sporadic than that, and that means fw should no longer be considered an encoding option.
- `<label>` does not require @xml:lang. In our current encoding practice, adding @xml:lang to `<label>` is mandatory because the only function for which we use this element is editorial labelling in editions. The editions are always in a source language, and the presence of @xml:lang="eng" serves to make it clear that the contents of the label don't belong to the source text. This leaves us free to use label without the language attribute for any labels which do belong to the source text.
- if we keep using fw only for the present purpose, then we should not introduce @type since it would be a needless complication of our encoding.
- the TEI guidelines seem, perhaps intentionally, vague on the intended use of the elements `<note>`, `<label>` and `<gloss>`. They certainly imply that `<note>` is, at least primarily, for things already present in the text that is being encoded, and not for notes added by the encoder. So as far as I am concerned, `<note>` is OK. The guidelines themselves seem to make no hard distinction between note and label, at one point saying that "This type of annotation, very common in the early printed texts which Coleridge may be presumed to be imitating in this case, may also be regarded as providing a heading or descriptive label for the passage concerned. The encoder may therefore prefer to use the label element to represent it".
- further, the description of the element label even says, "Labels may also be used to represent a label or heading attached to a paragraph or sequence of paragraphs not treated as a structural division, or to a group of verse lines. Note that, in this case, the label element appears within the p or lg element, rather than as a preceding sibling of it."
- given all of the above considerations, and assuming that these thingies are far from ubiquitous throughout the MS(s) where they occur, my preference would be to stick to label: without @xml:lang, with @place taking the same values as used for `<fw>`. If distinguishing these with @type from the so-far-used editorial labels would be good practice, then I suggest that "editorial" be the default value of @type on `<label>` (so it does not have to be explicitly added to the pre-existing editorial labels), and for these labels we use @type="original" or whatever value Arlo prefers. The label element could then be placed at any convenient point in the text, e.g. right after the `<lb>` representing the line with which the thingy is aligned.
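
A minimal sketch of the preference just described, assuming `type="original"` is the value eventually chosen and `place="left"` is among the @place values already used for `<fw>`. The folio number, line numbers, heading text and `TEST` content are placeholders, and the fragment has not been validated against the project schema.

```xml
<!-- Illustrative sketch only: placeholder values, not validated against the project schema -->
<div>
  <!-- Existing editorial label: keeps xml:lang="eng"; type="editorial" would be the implicit default -->
  <label xml:lang="eng">Editorial heading</label>
  <p>
    <pb n="16v"/>
    <fw place="left" n="16v">16</fw>
    <lb n="1"/>text of line 1
    <!-- Label belonging to the source text: no xml:lang, typed "original", placed right after the aligned lb -->
    <lb n="2"/><label type="original" place="left">TEST</label>text of line 2
  </p>
</div>
```

The absence of @xml:lang is what marks the second label as part of the source text; the explicit @type would only make that distinction easier to query.
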
- Let's take a decision now. I think I am happy to go with `<label>` placed directly after the `<pb>` for the page (folio-side) on whose margin it occurs, or after `<lb>` if it is intended to align with a specific line. But some questions remain.
- If there is clearly more than one such marginal note on a page, can we have more than one `<label>`?
- Can we use `<lb>` and other diplomatic editing tags inside `<label>`? (Marginalia are often laid out over multiple lines.)
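
To make the two open questions concrete, the encoding being asked about would look something like the sketch below. All values (`16v`, `FIRST`, `SECOND`, `place="left"`, the line numbers) are placeholders, and whether the project schema accepts this is precisely what remains to be checked.

```xml
<!-- Illustrative sketch only: placeholder values, not validated against the project schema -->
<div>
  <p>
    <pb n="16v"/>
    <!-- Question 1: two separate marginal notes on the same page, as two consecutive labels -->
    <label place="left">FIRST</label>
    <!-- Question 2: a marginal note laid out over two lines, with lb inside the label -->
    <label place="left"><lb/>SECOND, first line<lb/>SECOND, second line</label>
    <lb n="1"/>text of line 1
  </p>
</div>
```
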