Help for EOPAS: Frequently asked questions
How to cite a phrase, word or morpheme?
On the display page for a transcript, the URL of the current context is displayed at the top right.
You can copy and paste this URL to link directly to the current context.
Two types of context are tracked.
The first is the context of the currently active phrase. Such a URL looks as follows:
The fragment on this URL starts with "#t=", which means that the URL addresses a time range
of the video or audio file presented on this Web page. In this example, the time segment
starts at 202.028 seconds and goes until the end time given in the fragment.
As the video or audio element plays and moves through the annotation phrases, the URL at the top right is updated.
You can pause the video or audio element and copy and paste the URL for citation purposes.
Alternatively, you can right-click on the little black "play" button of a particular phrase
and copy the link address.
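For readers who want to process such citation URLs in a script, here is a minimal Python sketch that reads the time range from the "#t=" fragment. It assumes the fragment has the form "#t=start,end" with both times in seconds (the layout used by the W3C Media Fragments syntax); the example URL and the function name parse_time_fragment are hypothetical and not part of EOPAS.

```python
from urllib.parse import urlsplit


def parse_time_fragment(url):
    """Return (start, end) in seconds from a '#t=start,end' fragment.

    Returns None if the URL has no '#t=' fragment. A missing start
    defaults to 0.0 and a missing end is returned as None.
    Assumption: times are plain seconds separated by a comma.
    """
    fragment = urlsplit(url).fragment
    if not fragment.startswith("t="):
        return None
    start_text, _, end_text = fragment[2:].partition(",")
    start = float(start_text) if start_text else 0.0
    end = float(end_text) if end_text else None
    return start, end


# Hypothetical citation URL, for illustration only.
print(parse_time_fragment("http://example.org/transcripts/1#t=202.028,205.5"))
# prints (202.028, 205.5)
```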
The second is the context of a single word or morpheme. This is tightly linked to concordance searches,
which are activated by clicking on a word or morpheme and search for that same entity across all documents.
As you click on a word or morpheme, a URL that looks as follows is displayed in the top right corner:
The pattern of that URL is as follows: "#!" identifies this concordance linking, "/p" identifies the phrase by number,
"/w" identifies the word within that phrase by position, and "/m" identifies (where necessary) the morpheme within that
word by position.
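The pattern described above can be taken apart with a short Python sketch. It assumes that each number directly follows its letter, as in "#!/p12/w3/m2"; that exact layout, the example URL and the function name parse_concordance_fragment are assumptions made for illustration, not something confirmed by EOPAS itself.

```python
import re
from urllib.parse import urlsplit

# Assumed layout: "!/p<phrase>", optionally "/w<word>", optionally "/m<morpheme>".
_CONCORDANCE = re.compile(r"^!/p(\d+)(?:/w(\d+)(?:/m(\d+))?)?$")


def parse_concordance_fragment(url):
    """Return the phrase, word and morpheme positions, or None if no match."""
    match = _CONCORDANCE.match(urlsplit(url).fragment)
    if match is None:
        return None
    phrase, word, morpheme = (
        int(group) if group is not None else None for group in match.groups()
    )
    return {"phrase": phrase, "word": word, "morpheme": morpheme}


# Hypothetical citation URL, for illustration only.
print(parse_concordance_fragment("http://example.org/transcripts/1#!/p12/w3/m2"))
# prints {'phrase': 12, 'word': 3, 'morpheme': 2}
```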
Where are the XML schemas of the formats?
Where are the XSL Transforms to convert between formats?
What format does a Toolbox input file have to be for import?
Some Toolbox files use camel-case for element and attribute names.
Others come with a namespace prefix of "tb:" on all the elements.
These differences are removed by a clean-up stylesheet called fixToolbox.xsl.
The following mapping of Toolbox elements to EOPAS is undertaken:
itmgroup -> get each
idgroup -> phrase
concat txgroup/tx -> transcription
txgroup -> wordlist
tx -> word/text
mr -> morphemelist/morpheme/text@kind['morpheme']
mg -> morphemelist/morpheme/text@kind['gloss']
fg -> translation
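To make the mapping above more concrete, the following Python sketch applies it to an already cleaned-up Toolbox document. This is not the XSLT that EOPAS uses. It rests on several assumptions: each txgroup holds one word with its mr and mg elements in matching order, fg sits directly under idgroup, the morphemelist is placed inside its word, and the root element name "eopas" and the function name toolbox_to_eopas are hypothetical.

```python
import xml.etree.ElementTree as ET


def toolbox_to_eopas(toolbox_root):
    """Build an EOPAS-like element tree from a cleaned-up Toolbox tree."""
    eopas = ET.Element("eopas")  # hypothetical root element name
    for idgroup in toolbox_root.iter("idgroup"):
        # idgroup -> phrase
        phrase = ET.SubElement(eopas, "phrase")
        # concat txgroup/tx -> transcription
        words = [tx.text or "" for tx in idgroup.findall("txgroup/tx")]
        ET.SubElement(phrase, "transcription").text = " ".join(words)
        # txgroup -> wordlist, tx -> word/text
        wordlist = ET.SubElement(phrase, "wordlist")
        for txgroup in idgroup.findall("txgroup"):
            word = ET.SubElement(wordlist, "word")
            ET.SubElement(word, "text").text = txgroup.findtext("tx", default="")
            # mr and mg -> morphemelist/morpheme/text with a kind attribute
            morphemelist = ET.SubElement(word, "morphemelist")
            for mr, mg in zip(txgroup.findall("mr"), txgroup.findall("mg")):
                morpheme = ET.SubElement(morphemelist, "morpheme")
                ET.SubElement(morpheme, "text", kind="morpheme").text = mr.text
                ET.SubElement(morpheme, "text", kind="gloss").text = mg.text
        # fg -> translation
        free_gloss = idgroup.findtext("fg")
        if free_gloss is not None:
            ET.SubElement(phrase, "translation").text = free_gloss
    return eopas
```

The result can be inspected with ET.tostring(toolbox_to_eopas(root), encoding="unicode"), where root is a Toolbox tree parsed with ET.parse or ET.fromstring.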
What format does a Transcriber input file have to be for import?
EOPAS does not model speaker turns, so they are removed.
Instead, the EOPAS XML file format records the speaker information on each phrase.
Topic information and sections are removed.
The following mapping of Transcriber elements to EOPAS is undertaken:
Episode -> get each
Sync -> phrase with Turn@speaker
Sync.content() -> transcription
Comment@desc -> transcription with "[" "]"
Event@desc+@extent -> transcription with "[" "]"
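The Transcriber mapping above can be sketched in Python as follows. This is an illustration, not the EOPAS XSLT. It assumes that Sync elements are empty and that the spoken text follows them inside the Turn, that a Comment or Event belongs to the most recent Sync, and that the desc and extent of an Event are joined with a space inside the brackets. The function name transcriber_to_phrases and the dictionary layout of the result are hypothetical.

```python
import xml.etree.ElementTree as ET


def transcriber_to_phrases(trans_root):
    """Return one dict per Sync, carrying the speaker of the enclosing Turn."""
    phrases = []
    for turn in trans_root.iter("Turn"):
        speaker = turn.get("speaker")
        current = None
        for child in turn:
            if child.tag == "Sync":
                # Sync -> phrase with Turn@speaker
                current = {"speaker": speaker, "start": child.get("time"), "parts": []}
                phrases.append(current)
            elif current is None:
                continue  # ignore anything before the first Sync of a Turn
            elif child.tag == "Comment":
                # Comment@desc -> transcription with "[" "]"
                current["parts"].append("[" + child.get("desc", "") + "]")
            elif child.tag == "Event":
                # Event@desc+@extent -> transcription with "[" "]"
                label = (child.get("desc", "") + " " + child.get("extent", "")).strip()
                current["parts"].append("[" + label + "]")
            # Text that follows an element belongs to the current phrase.
            if child.tail and child.tail.strip():
                current["parts"].append(child.tail.strip())
    return [
        {
            "speaker": p["speaker"],
            "start": p["start"],
            "transcription": " ".join(p["parts"]),
        }
        for p in phrases
    ]
```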
What format does an Elan input file have to be for import?
Elan allows a vast combination of tier types, and the choice of tier names is up to the author.
The EOPAS file format only supports a limited structure, so only three different types of Elan files are supported.
Option 1: The following mapping of Elan elements with default-lt tiers is undertaken:
Option 2: The following mapping of Elan elements with utterance tiers is undertaken:
Option 3: The following mapping of Elan elements with ref tiers is undertaken:
tier=ref -> phrase/transcription
tier=tx -> wordlist/word/text
tier=mr -> morphemelist/morpheme/text@kind['morpheme']
tier=mg -> morphemelist/morpheme/text@kind['gloss']
tier=fg -> translation
tier=graid -> graid annotation
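Before uploading an Elan file with ref tiers, it can help to check which of its tiers the Option 3 mapping above would pick up. The following Python sketch does that, assuming that the tier names in the mapping correspond to the TIER_ID attribute of the TIER elements in the .eaf file. The names REF_TIER_MAPPING and describe_ref_tiers are hypothetical and not part of EOPAS.

```python
import xml.etree.ElementTree as ET

# Option 3 mapping as listed above, keyed by the assumed TIER_ID value.
REF_TIER_MAPPING = {
    "ref": "phrase/transcription",
    "tx": "wordlist/word/text",
    "mr": "morphemelist/morpheme/text@kind['morpheme']",
    "mg": "morphemelist/morpheme/text@kind['gloss']",
    "fg": "translation",
    "graid": "graid annotation",
}


def describe_ref_tiers(eaf_root):
    """Map each TIER_ID in the document to its EOPAS target, or None."""
    result = {}
    for tier in eaf_root.iter("TIER"):
        tier_id = tier.get("TIER_ID")
        result[tier_id] = REF_TIER_MAPPING.get(tier_id)
    return result
```

Tiers that map to None are not covered by the ref-tier option in this sketch.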
What is involved in writing support for a new format?
EOPAS is based on the assumption that all import formats are provided in XML and thus have an XML schema.
To support a new input format, one has to provide an XML Schema for that format and place it in the
directory "public/SCHEMAS" on the server. Next, one has to create an XSL Transform that converts that
XML format to the EOPAS XML format. The XSLT is placed in the "public/XSLT" directory.
Finally, one has to change the list of available input formats in "app/models/transcript.rb" to make it an
available format in the upload process.
For development of the XSLT, there are helper scripts in the "bin" directory of the application for
validation, transcoding and general running of an XSL transform. Further, one should add some example files
to the "features/test_data" directory and add a test to "features/transcript.feature". Do not forget to
finish it all off with documentation in "doc/TRANSCODING".