<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <meta name="author" content="Johann-Mattis List" />
  <title>spec</title>
  <style>
    code{white-space: pre-wrap;}
    span.smallcaps{font-variant: small-caps;}
    span.underline{text-decoration: underline;}
    div.column{display: inline-block; vertical-align: top; width: 50%;}
    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
    ul.task-list{list-style: none;}
    pre > code.sourceCode { white-space: pre; position: relative; }
    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
    pre > code.sourceCode > span:empty { height: 1.2em; }
    .sourceCode { overflow: visible; }
    code.sourceCode > span { color: inherit; text-decoration: inherit; }
    div.sourceCode { margin: 1em 0; }
    pre.sourceCode { margin: 0; }
    @media screen {
    div.sourceCode { overflow: auto; }
    }
    @media print {
    pre > code.sourceCode { white-space: pre-wrap; }
    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
    }
    pre.numberSource code
      { counter-reset: source-line 0; }
    pre.numberSource code > span
      { position: relative; left: -4em; counter-increment: source-line; }
    pre.numberSource code > span > a:first-child::before
      { content: counter(source-line);
        position: relative; left: -1em; text-align: right; vertical-align: baseline;
        border: none; display: inline-block;
        -webkit-touch-callout: none; -webkit-user-select: none;
        -khtml-user-select: none; -moz-user-select: none;
        -ms-user-select: none; user-select: none;
        padding: 0 4px; width: 4em;
        color: #aaaaaa;
      }
    pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
    div.sourceCode
      {   }
    @media screen {
    pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
    }
    code span.al { color: #ff0000; font-weight: bold; } /* Alert */
    code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
    code span.at { color: #7d9029; } /* Attribute */
    code span.bn { color: #40a070; } /* BaseN */
    code span.bu { } /* BuiltIn */
    code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
    code span.ch { color: #4070a0; } /* Char */
    code span.cn { color: #880000; } /* Constant */
    code span.co { color: #60a0b0; font-style: italic; } /* Comment */
    code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
    code span.do { color: #ba2121; font-style: italic; } /* Documentation */
    code span.dt { color: #902000; } /* DataType */
    code span.dv { color: #40a070; } /* DecVal */
    code span.er { color: #ff0000; font-weight: bold; } /* Error */
    code span.ex { } /* Extension */
    code span.fl { color: #40a070; } /* Float */
    code span.fu { color: #06287e; } /* Function */
    code span.im { } /* Import */
    code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
    code span.kw { color: #007020; font-weight: bold; } /* Keyword */
    code span.op { color: #666666; } /* Operator */
    code span.ot { color: #007020; } /* Other */
    code span.pp { color: #bc7a00; } /* Preprocessor */
    code span.sc { color: #4070a0; } /* SpecialChar */
    code span.ss { color: #bb6688; } /* SpecialString */
    code span.st { color: #4070a0; } /* String */
    code span.va { color: #19177c; } /* Variable */
    code span.vs { color: #4070a0; } /* VerbatimString */
    code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
    .display.math{display: block; text-align: center; margin: 0.5rem auto;}
  </style>
  <link rel="stylesheet" href="css/pandoc.css" />
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<h3 id="overview">1 Overview</h3>
<p>MISOL consists of four major components, accessible in four different
tabs of the web interface. The first component defines sound classes and
sound laws. The former allow users to group sounds into arbitrary units, and
the latter define how sounds in an ancestral language change
into sounds in a descendant language in a certain context. The second
component converts words in the ancestral language into words
in the descendant language (also known as “forward reconstruction”), and
the third component guesses from which words in the ancestral
language a given word in the target language may have evolved. The fourth
component imports and exports data in text form, enabling users
to store their analyses, parse the data with additional software tools,
or compare different approaches to the same problem in
phonological reconstruction. The components are summarized in Figure
1.</p>
<figure>
<img src="img/workflow.png?raw=true" title="Major components of MISOL"
style="width:90.0%" alt="Figure 1: Major components of MISOL" />
<figcaption aria-hidden="true">Figure 1: Major components of
MISOL</figcaption>
</figure>
<h3 id="defining-sound-classes-tab-classes-and-laws">2 Defining Sound
Classes (Tab “Classes and Laws”)</h3>
<p>The table of sound classes defines sounds in a very simple way, by
assigning one or more sounds or sound classes to a certain sound class
label on each line. Sound classes are read in from the first to the last
line and definitions are stored at the time a line is parsed. As a
result, sound class assignments can be combined, and an existing sound
class can be assigned to another class.</p>
<p>The assignment of one or more sounds or sound classes to a given
sound class is represented in the form:</p>
<div class="mycode">
<div class="sourceCode" id="cb1"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="ex">name</span> = sound1 sound2 sound3</span></code></pre></div>
</div>
<p>The <code>name</code> of a sound class must be alphanumeric, similar
to typical Python variables, and must not start with a digit. The
<code>=</code>-sign must have a space to the left and to the right. All
sounds (or sound groups referenced by invoking an existing sound class
name) must be separated by a space.</p>
<p>Thus, the following examples for sound class definitions are all
<em>wrong</em>:</p>
<div class="badcode">
<div class="sourceCode" id="cb2"><pre
class="sourceCode bash"><code class="sourceCode bash"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="va">name</span><span class="op">=</span>sound1 <span class="ex">sound2</span> sound3</span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="ex">name</span> =sound1 sound2 sound3</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="va">name</span><span class="op">=</span> <span class="ex">sound1</span> sound2 sound3</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="ex">1name</span> = sound1 sound2 sound3</span></code></pre></div>
</div>
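<p>The formatting rules above can be checked mechanically. The following
Python sketch is only an illustration and not the MISOL implementation:
the function name <code>parse_class_line</code> is hypothetical, and
Python's <code>str.isidentifier</code> is used as an approximation of the
naming rule.</p>

```python
def parse_class_line(line):
    """Return (name, sounds) for a well-formed definition, otherwise None."""
    # The "=" sign needs a space on both sides.
    name, separator, rest = line.partition(" = ")
    if not separator:
        return None
    # Approximation of the naming rule: a Python-like identifier,
    # which also rejects names that start with a digit.
    if not name.isidentifier():
        return None
    # Sounds (or names of existing classes) are separated by spaces.
    sounds = rest.split()
    if not sounds:
        return None
    return name, sounds
```

<p>Under these assumptions, the well-formed line is parsed into a name and
a list of sounds, while each of the four wrong examples yields
<code>None</code>.</p>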
<p>Internally, a sound class is an ordered list of sounds. By assigning
sounds to a sound class, the sounds are made available to the MISOL
system to act as source and target of sound change processes and to be
referenced in sound laws. In addition to referencing groups of sounds
with the help of sound class variables, all individual (“terminal”)
sounds are also represented as sound classes. The difference is that
these classes have the same label as the sound itself and that they
contain only one element (the sound they refer to).</p>
<p>Furthermore, each terminal symbol that is identified as a sound (and not
a sound class name) by the MISOL system is also assigned to its own
class with one single member. Thus, the following line will define
5 internal sound classes, of which four have one single member each,
and the first contains all four sounds in the system.</p>
<pre><code>my_class = a b c d</code></pre>
<p>Thus, internally, this will result in the following key-value
representation:</p>
<div class="mycode">
<div class="sourceCode" id="cb4"><pre
class="sourceCode json"><code class="sourceCode json"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="fu">{</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;my_class&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;a&quot;</span><span class="ot">,</span> <span class="st">&quot;b&quot;</span><span class="ot">,</span> <span class="st">&quot;c&quot;</span><span class="ot">,</span> <span class="st">&quot;d&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;a&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;a&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;b&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;b&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;c&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;c&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;d&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;d&quot;</span><span class="ot">]</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a><span class="fu">}</span></span></code></pre></div>
</div>
<p>When parsing sound laws, MISOL automatically checks for sounds that
have not been referenced in the sound class table and adds them as
individual sounds to the table of sound classes. As a result, you do
<strong>not need to define sound classes in order to define sound
laws</strong>.</p>
<p>In order to check which sound classes have been defined in MISOL,
click on the SHOW CLASSES AND LAWS button after having inserted your
sound class definitions and your sound laws in the <em>Classes and
Laws</em> tab. A table will open and show all sound classes that
have been defined, including both the classes that you defined
actively and the classes that were inferred automatically from
the sound laws you defined.</p>
<p>When you check the sound classes in MISOL, you will see that the list
of classes starts with three sound classes that are
provided regardless of what you have defined. These reserved
classes are the symbols <code>^</code>, <code>$</code>, and
<code>-</code>. <code>^</code> refers to the beginning of a sequence and
can be used in the context string of a sound law. The same holds for
<code>$</code>, which refers to the end of a sequence. <code>-</code> is
the gap symbol used in sound laws in which an element is lost (rather than being
replaced by something). It can be used as a source sound (in the
case of epenthesis, which must be actively modeled) or as a target sound
in a sound law. Other than for these specific purposes, the symbols should
not be used.</p>
<p>Sound classes are a way to model the distinctive features that define
individual sounds. The difference from the feature-bundle representations
of sounds in other sound change models is that features are defined flexibly
on the fly and modeled as “tags” of individual sounds, or as a
shortcut to reference, in an ordered manner, the sounds that are tagged
with a certain sound class name. In our opinion, this comes quite close
to the way linguists have used feature bundles intuitively so far,
since one can define one’s sound system in a convenient manner and
provide major distinctions that may play a role in sound laws, such as
voicing distinctions of consonants:</p>
<div class="mycode">
<pre><code>voiced = b d g
voiceless = p t k</code></pre>
</div>
<p>Another important aspect of sound classes is that they can be used as
a shortcut for a group of sounds in sound laws, which often consist of
an abstract set of independent sound changes, rather than an individual
sound change that occurs in one context alone. As a result, one can
refer to both individual sounds and to sound classes in the sound law
descriptions of MISOL.</p>
<p>As a further note on the way in which sound classes are handled in
MISOL, consider the following assignments:</p>
<div class="mycode">
<pre><code>voiced = b d g
voiceless = p t k
consonant = voiced voiceless m n ŋ</code></pre>
</div>
<p>This translates internally to the following major sound class
representations:</p>
<div class="mycode">
<div class="sourceCode" id="cb7"><pre
class="sourceCode json"><code class="sourceCode json"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="fu">{</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;voiced&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;b&quot;</span><span class="ot">,</span> <span class="st">&quot;d&quot;</span><span class="ot">,</span> <span class="st">&quot;g&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;voiceless&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;p&quot;</span><span class="ot">,</span> <span class="st">&quot;t&quot;</span><span class="ot">,</span> <span class="st">&quot;k&quot;</span><span class="ot">]</span><span class="fu">,</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a>  <span class="dt">&quot;consonant&quot;</span><span class="fu">:</span> <span class="ot">[</span><span class="st">&quot;b&quot;</span><span class="ot">,</span> <span class="st">&quot;d&quot;</span><span class="ot">,</span> <span class="st">&quot;g&quot;</span><span class="ot">,</span> <span class="st">&quot;p&quot;</span><span class="ot">,</span> <span class="st">&quot;t&quot;</span><span class="ot">,</span> <span class="st">&quot;k&quot;</span><span class="ot">,</span> <span class="st">&quot;m&quot;</span><span class="ot">,</span> <span class="st">&quot;n&quot;</span><span class="ot">,</span> <span class="st">&quot;ŋ&quot;</span><span class="ot">]</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="fu">}</span></span></code></pre></div>
</div>
<p>Thus, if a sound class like <code>voiced</code> has been assigned
a list of sounds, the label can be reused in order to assign the same
group of sounds to another sound class. Internally, all sound classes
are represented only as a group of terminal sounds, and sound classes
can only be reused in assignments if they have already been defined. As a
result, the following order of assignments would be problematic:</p>
<div class="badcode">
<pre><code>consonant = voiced voiceless m n ŋ
voiced = b d g
voiceless = p t k</code></pre>
</div>
<p>Since <code>voiced</code> and <code>voiceless</code> have not yet been
introduced with their target group of sounds, the interpreting code
of MISOL would treat them as individual sounds (which can be represented
by any string combination, provided it does not contain a space).
Individual sounds, however, cannot be assigned to another group of
sounds, since they are internally assigned to a group of one sound only,
so the program throws a warning here and ignores the corresponding
line.</p>
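<p>The line-by-line reading of sound class definitions described above can
be sketched in Python as follows. This is a simplified illustration under
stated assumptions, not the code of MISOL: the function name
<code>build_classes</code> and the wording of the warning are hypothetical,
and the sketch only covers the expansion of known classes, the automatic
registration of terminal sounds, and the skipping of lines that try to
redefine a terminal sound.</p>

```python
def build_classes(lines):
    """Read class definitions in order; return (classes, warnings)."""
    classes = {}       # label -> ordered list of terminal sounds
    terminals = set()  # labels registered as individual sounds
    warnings = []
    for line in lines:
        name, separator, rest = line.partition(" = ")
        if not separator:
            continue  # not a definition line
        if name in terminals:
            # The label was already read as an individual sound,
            # so the line is reported and ignored.
            warnings.append("ignored: " + line)
            continue
        members = []
        for symbol in rest.split():
            if symbol not in classes:
                # Unknown symbols become one-member (terminal) classes.
                classes[symbol] = [symbol]
                terminals.add(symbol)
            # Known classes are expanded to their terminal sounds.
            members.extend(classes[symbol])
        classes[name] = members
    return classes, warnings
```

<p>With the definitions in the recommended order, <code>consonant</code>
expands to the nine terminal sounds. With the problematic order, it
contains the strings <code>voiced</code> and <code>voiceless</code> as if
they were sounds, and the two later lines are reported and skipped.</p>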
<p>Since MISOL does not care how you define your sounds, it offers the
possibility to work with groups of sounds as well as with individual
sounds when dealing with sound change. In order to make sure that we
distinguish groups from individual sounds, the recommendation is to use
a dot <code>.</code> between sounds in a sound sequence in order to
indicate that one is not dealing with individual phonemes. Thus, the
final or rhyme of a Chinese word like <code>[</code>kwaŋ<code>]</code>
could then be conveniently written as <code>a.ŋ</code>. MISOL, however,
will treat this sequence of grouped sounds as an individual sound class
(potentially a terminal one) and not assign it a specific semantics.</p>
<p>Since sound classes are ordered lists of sounds, nothing prevents you
from assigning the same sound multiple times to one and the
same sound class. This may be important in cases where <em>mergers</em>
are described in complex sound laws that deal with more than one input
sound.</p>
<p>Two symbols are automatically defined as sounds and
cannot be assigned to sound class groups: <code>^</code>
represents the beginning of each word and <code>$</code> the end.
<code>#</code> is reserved as a comment marker.</p>
<h3 id="defining-sound-laws-tab-classes-and-laws">3 Defining Sound Laws
(Tab “Classes and Laws”)</h3>
<p>A sound law is an abstract formula that shows how one or more sounds
in an ancestral language are converted to one or more sounds in a
descendant language. It has the general formula:</p>
<div class="mycode">
<pre class="shell"><code>source &gt; target / context</code></pre>
</div>
<p>The number of source sounds and target sounds must be identical and
the context is optional and can be omitted:</p>
<div class="mycode">
<pre class="shell"><code>source &gt; target</code></pre>
</div>
<p>The change marker <code>&gt;</code> must be preceded and followed by
a space. The following lines are therefore malformed and lead to
errors:</p>
<div class="badcode">
<pre class="shell"><code>source&gt; target
source &gt;target
source&gt;target</code></pre>
</div>
<p>The same strict rules apply to the marker separating the context,
the slash, which must be preceded and followed by at least one space.
Again, each of the following lines yields an error and, as a result,
is ignored.</p>
<div class="badcode">
<pre class="shell"><code>source &gt; target/ context _
source &gt; target /context _
source &gt; target/context</code></pre>
</div>
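<p>The spacing rules for the change marker and the context marker can be
illustrated with a small Python sketch. It is not the parser used by MISOL:
the function name <code>parse_law</code> and the regular expression are
assumptions made for this illustration, and the sketch further assumes
that sounds themselves never contain <code>&gt;</code> or
<code>/</code>.</p>

```python
import re

# source " > " target, optionally followed by " / " and a context
LAW = re.compile(
    r"(?P<source>[^>/]+?) > (?P<target>[^>/]+?)"
    r"(?:\s+/\s+(?P<context>[^>/]+))?"
)

def parse_law(line):
    """Return (source, target, context) or None for a malformed line."""
    match = LAW.fullmatch(line.strip())
    if match is None:
        return None
    # The context is None when the law has no context part.
    return match.group("source"), match.group("target"), match.group("context")
```

<p>Under these assumptions, the well-formed laws are split into their
parts, while all six malformed lines shown above are rejected.</p>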
<h4 id="source-and-target-in-sound-laws">3.1 Source and Target in Sound
Laws</h4>
<p>Source and target can be either a single sound, sound class, or list
of sounds (indicated by square brackets) or a sequence of sounds. If a
sequence of sounds is provided, this will be interpreted internally by
invoking two or more separate sound laws. Thus, writing</p>
<div class="mycode">
<pre class="shell"><code>a b &gt; [c d] / x _ y</code></pre>
</div>
<p>is equivalent to writing</p>
<div class="mycode">
<pre class="shell"><code>a &gt; c / x _ b y
b &gt; d / x a _ y</code></pre>
</div>
<p>Defining consecutive sounds in this way is thus a mere shortcut, but it
can come in handy in cases where complex sound laws would otherwise be
difficult to define.</p>
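<p>The expansion of a sequence into separate sound laws, as described
above, can be sketched in Python. The function name
<code>expand_sequence</code> is hypothetical, and the sketch takes source,
target, and context as ready-made lists of symbols rather than parsing the
notation itself.</p>

```python
def expand_sequence(source, target, left, right):
    """Split a law over a sequence into one law per position."""
    if len(source) != len(target):
        raise ValueError("source and target need the same number of positions")
    laws = []
    for i, (old, new) in enumerate(zip(source, target)):
        # Neighbouring source sounds become part of the context.
        left_context = left + source[:i]
        right_context = source[i + 1:] + right
        context = " ".join(left_context + ["_"] + right_context)
        laws.append(old + " > " + new + " / " + context)
    return laws
```

<p>For the source sequence <code>a b</code>, the targets <code>c</code> and
<code>d</code>, and the context <code>x _ y</code>, the sketch returns the
two separate laws given in the example above.</p>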
<p>Whether you pass a sound class, a single sound, or a list of sounds
makes no difference. Thus, if you have defined a sound class
<code>ptk</code> as shortcut for the sounds <code>p</code>,
<code>t</code>, and <code>k</code>, the following two statements are
equivalent:</p>
<div class="mycode">
<pre class="shell"><code>ptk &gt; ptk
[p t k] &gt; [p t k]</code></pre>
</div>
<p>The same holds for sequences of sounds:</p>
<div class="mycode">
<pre class="shell"><code>ptk ptk &gt; ptk ptk
[p t k] [p t k] &gt; [p t k] [p t k]</code></pre>
</div>
<p>Mixing is also possible.</p>
<div class="mycode">
<pre class="shell"><code>ptk [p t k] &gt; [p t k] ptk</code></pre>
</div>
<p>Note, however, that it is essential that the source and the target
always contain the <em>same number of sounds</em> and the <em>same
number of positions</em>. If you want to indicate the loss of a sound,
use the <code>-</code> as gap marker:</p>
<div class="mycode">
<pre class="shell"><code>ptk ptk &gt; ptk [- - -]</code></pre>
</div>
<p>Note that in this example you cannot write a single gap symbol, but
must assemble a group (or define a sound class with the group beforehand),
since each source sound requires one target sound and vice
versa. This also means that you must repeat a sound when using sound
class notation to formulate sound laws in which a merger happens.</p>
<div class="mycode">
<pre class="shell"><code>[p t k] &gt; [p p p]</code></pre>
</div>
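<p>The one-to-one pairing of source and target sounds can be made explicit
with a short Python sketch. The function name <code>pair_groups</code> is
hypothetical, and the sketch only illustrates the requirement that both
groups have the same number of members.</p>

```python
def pair_groups(source, target):
    """Pair each source sound with the target sound in the same slot."""
    if len(source) != len(target):
        raise ValueError("each source sound needs exactly one target sound")
    return list(zip(source, target))

# A merger repeats the target sound, a loss repeats the gap symbol.
merger = pair_groups(["p", "t", "k"], ["p", "p", "p"])
loss = pair_groups(["p", "t", "k"], ["-", "-", "-"])
```

<p>A group of three source sounds paired with a single target symbol would
raise an error in this sketch, which mirrors the rule that a single gap
symbol is not sufficient.</p>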
<p>If you want to indicate that one sound turns into two sounds, which
could happen in the case of epenthesis, you must provide the two sounds
that replace the one sound in the original separated by a dot, as
follows:</p>
<div class="mycode">
<pre class="shell"><code>n &gt; n.d / _ r [a e i o u]</code></pre>
</div>
<p>MISOL will internally replace the sound by the sequence
<code>n.d</code> in the respective context, but the final output will
provide the sound in merged form.</p>
<p>MISOL will in all cases represent sound laws individually, on the
basis of one source sound corresponding to one target sound in one
individual context.</p>
<h4 id="context-in-sound-laws">3.2 Context in Sound Laws</h4>
<p>The context typically has the form:</p>
<div class="mycode">
<pre class="shell"><code>left_context _ right_context</code></pre>
</div>
<p>Here, <code>_</code> represents the source sound. The left or the right
context can be omitted.</p>
<div class="mycode">
<pre class="shell"><code>left_context _
_ right_context</code></pre>
</div>
<p>Left and right context are defined in the same way, by a
segmental representation of the sound sequence that proceeds to the left in
the left context and to the right in the right context. In this way,
even very long-ranging contexts can in theory be modeled. If one
wants to change an <code>[s]</code> followed by <code>[p, t, k]</code>
and a vowel to <code>[ʃ]</code>, one can write:</p>
<div class="mycode">
<pre class="shell"><code>s &gt; ʃ / _ [p t k] vowel</code></pre>
</div>
<p>Here, the square brackets <code>[</code> and <code>]</code> are used
to indicate that the three sounds <code>p</code>, <code>t</code>, and
<code>k</code> represent a group that can alternatively occur in the
second position following the source sound. For a full-fledged toy example
that models that an <code>s</code> becomes voiced when followed by
a vowel and turns into a <code>ʃ</code> when followed by a consonant and
a vowel, one could define the following sound classes:</p>
<div class="mycode">
<pre class="shell"><code>consonant = p t k b d g s z ʃ
ptk = p t k b d g
vowel = a e i o u</code></pre>
</div>
<p>These could then be used in four sound laws:</p>
<div class="mycode">
<pre class="shell"><code>s &gt; ʃ / _ [p t k] vowel
s &gt; z / _ vowel
p t k b d g &gt; p t k b d g
vowel &gt; vowel</code></pre>
</div>
<p>These would turn a word like <code>s t a b</code> into
<code>ʃ t a b</code> but would turn <code>s a b</code> into
<code>z a b</code>. Defining groups of sounds with square brackets can
be done in a very flexible manner, and even sound classes can be placed
inside brackets in order to form new groups of sounds. One can, for
example, define two groups of sound classes for voiced and voiceless
sounds as follows:</p>
<div class="mycode">
<pre class="shell"><code>ptk = p t k
bdg = b d g</code></pre>
</div>
<p>These can then be used in combination in a sound law.</p>
<div class="mycode">
<pre class="shell"><code>a &gt; e / _ [ptk bdg]</code></pre>
</div>
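<p>How the context decides between the two laws for <code>s</code> in the
toy example above can be traced with a small Python sketch. It is a
simplified illustration and not the MISOL engine: the names
<code>context_matches</code> and <code>apply_laws</code> are hypothetical,
contexts are given as lists of sets of sounds, all contexts are checked
against the original sequence, and sounds not covered by any law are simply
copied, which is an assumption of this sketch.</p>

```python
VOWEL = {"a", "e", "i", "o", "u"}

# Each law: (source, target, left context, right context).
LAWS = [
    ("s", "ʃ", [], [{"p", "t", "k"}, VOWEL]),
    ("s", "z", [], [VOWEL]),
]

def context_matches(padded, pos, left, right):
    """Check the sounds before and after position pos against the context."""
    start = pos - len(left)
    end = pos + 1 + len(right)
    if start < 0 or end > len(padded):
        return False
    before = padded[start:pos]
    after = padded[pos + 1:end]
    return (all(s in group for s, group in zip(before, left))
            and all(s in group for s, group in zip(after, right)))

def apply_laws(word, laws):
    """Apply all laws at once to the original, space-separated word."""
    padded = ["^"] + word.split() + ["$"]
    output = []
    for pos in range(1, len(padded) - 1):
        result = padded[pos]  # assumption: uncovered sounds are kept
        for source, target, left, right in laws:
            if padded[pos] == source and context_matches(padded, pos, left, right):
                result = target
                break
        output.append(result)
    return " ".join(output)
```

<p>Under these assumptions, <code>apply_laws("s t a b", LAWS)</code> gives
<code>ʃ t a b</code> and <code>apply_laws("s a b", LAWS)</code> gives
<code>z a b</code>, as in the toy example.</p>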
<h4 id="using-tiers-in-sound-laws">3.3 Using Tiers in Sound Laws</h4>
<p>MISOL is based on the idea that a sound sequence is often best
represented as a sequence consisting of multiple <em>tiers</em> (similar
to multi-tiered annotation of texts in linguistic examples, such as
interlinear-glossed text), that is, a matrix in which different aspects
of the sequence are treated in segmental form. Tone, for example, can
often be thought of as representing the whole syllable of a word in some
tone languages, rather than only one of the sounds in the syllable, or
the vocalic nucleus.</p>
<p>MISOL supports using multi-tiered sequences in two ways. First, one
can define multi-tiered sequences in a very flexible fashion by just
providing a matrix of symbols with as many tiers as one wants to use
instead of using only one tier alone. A word in a tonal language,
consisting of two syllables with two distinct tones, could thus be
represented in the following form:</p>
<div class="mycode">
<pre class="shell"><code>p a ŋ t a n
¹ ¹ ¹ ² ² ²</code></pre>
</div>
<p>In a similar way, one can represent stress in a word, e.g., by using
the symbol 1 for stressed syllables and the symbol 0 for unstressed
syllables.</p>
<div class="mycode">
<pre class="shell"><code>f a t ə r
1 1 0 0 0</code></pre>
</div>
<p>Different tiers apart from the first segmental tier (called
<code>segments</code> in MISOL) can be addressed in the context
definitions in all positions by using the symbol <code>@</code>,
followed by the name of the tier, in front of the group of sounds
(marked by square brackets), preceding or following the sound in
question (or the sound itself). Thus, to indicate that an unstressed
<code>[t]</code> turns into a <code>[d]</code>, one can write:</p>
<div class="mycode">
<pre class="shell"><code>t &gt; d / @stress[0]_</code></pre>
</div>
<p>When applying this sound law in forward reconstruction, one must
provide both tiers (the segments tier and the stress tier) and pass the
names of these tiers in the text field to the right of the field where
one inserts the sound sequences to be modified, as shown in Figure
2.</p>
<figure>
<img src="img/example-3-3.png" style="width:90.0%"
alt="Figure 2: Using multi-tiered sequence representations in sound laws." />
<figcaption aria-hidden="true">Figure 2: Using multi-tiered sequence
representations in sound laws.</figcaption>
</figure>
<p>The most common use case for sound laws with tiers is to define the
specific tier value that holds for the segment that one intends to
change (as we have seen in the example). Other use cases, however,
are also possible, for example when a certain tier value
holds for preceding or following sounds.</p>
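<p>The stress example above can be sketched in Python by storing each tier
as a list of the same length as the segments. The function name
<code>apply_tier_law</code> and the dictionary layout are assumptions of
this illustration, which only covers the case where the tier value holds
for the segment that is changed.</p>

```python
def apply_tier_law(tiers, source, target, tier_name, tier_values):
    """Change source to target where the named tier has one of the values."""
    segments = tiers["segments"]
    tier = tiers[tier_name]
    return [
        target if segment == source and value in tier_values else segment
        for segment, value in zip(segments, tier)
    ]

# The word from the stress example, with its two tiers.
word = {
    "segments": "f a t ə r".split(),
    "stress": "1 1 0 0 0".split(),
}
changed = apply_tier_law(word, "t", "d", "stress", {"0"})
```

<p>Since the <code>t</code> carries the stress value <code>0</code>, it is
replaced by <code>d</code> in this sketch, while all other segments stay
unchanged.</p>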
<p>Instead of actively <em>defining</em> and <em>passing</em> new tiers
for individual segmental representations of sound sequences, one can
also make use of inbuilt functions in MISOL that <em>compute</em> tiers
automatically. An example is again the use of a specific tonal tier
(called <code>tone</code>), which is computed from the segmental
representation of tones using superscript numbers at the end of each
syllable. Thus, passing a sound sequence such as
<code>p a ŋ ⁵ d a n ¹</code> will automatically yield a virtual
representation such as the following one internally in the MISOL
program:</p>
<div class="mycode">
<pre class="shell"><code>p a ŋ ⁵ d a n ¹
⁵ ⁵ ⁵ ⁵ ¹ ¹ ¹ ¹</code></pre>
</div>
<p>As a result, tonal tiers can be used, as long as the tone values are
indicated by superscript numbers at the end of the syllable in each
sequence. In order to invoke these tonal tiers, one must indicate this
actively when using the forward reconstruction method (or the backward
reconstruction), by adding <code>@tone</code> as an additional tier, as
shown in Figure 3, assuming the following sound law:</p>
<div class="mycode">
<pre class="shell"><code>d &gt; t / @tone[¹]_</code></pre>
</div>
<figure>
<img src="img/example-3-3-b.png" style="width:90.0%"
alt="Figure 3: Using multi-tiered sequence representations in sound laws with precomputed tiers." />
<figcaption aria-hidden="true">Figure 3: Using multi-tiered sequence
representations in sound laws with precomputed tiers.</figcaption>
</figure>
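<p>The computation of the tonal tier described above can be sketched in
Python. The function name <code>tone_tier</code> is hypothetical, and the
treatment of segments that are not followed by any tone mark (they receive
<code>0</code> here) is an assumption of this sketch rather than documented
behaviour.</p>

```python
SUPERSCRIPT_DIGITS = set("⁰¹²³⁴⁵⁶⁷⁸⁹")

def tone_tier(segments):
    """Copy each syllable-final tone mark to all segments of its syllable."""
    tier = []
    pending = 0  # segments seen since the last tone mark
    for segment in segments:
        pending += 1
        if all(char in SUPERSCRIPT_DIGITS for char in segment):
            # The tone mark closes the syllable, including its own slot.
            tier.extend([segment] * pending)
            pending = 0
    # Assumption: segments without a following tone mark receive "0".
    tier.extend(["0"] * pending)
    return tier
```

<p>For the sequence <code>p a ŋ ⁵ d a n ¹</code>, the sketch returns four
times <code>⁵</code> followed by four times <code>¹</code>, matching the
virtual representation shown above.</p>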
<p>Apart from tonal tiers (called <code>tone</code> in MISOL), MISOL
currently offers three more tiers that can be computed automatically: one
tier that checks for the nasality of whole words (called
<code>nasal</code>, returning <code>1</code> if a
word contains a nasal sound or a nasal vowel, and <code>0</code>
otherwise), one tier for the initial sound in a word (called
<code>initial</code>, returning the
initial sound for each segment in the word),
and one tier that handles the stress patterns of the word (called
<code>stress</code>, requiring a
specific annotation that uses stress markers to mark syllable boundaries
in an explicit manner).</p>
<p>As an example, consider the following three sound laws that each use
one of the automated tiers.</p>
<div class="mycode">
<pre class="shell"><code>d &gt; t / @tone[¹]_
p &gt; b / @initial[k]_
b &gt; m / @nasal[1]_</code></pre>
</div>
<p>Figure 4 shows how these can be applied to individual sequences, and
where in the tool one needs to provide the information on the tiers that
one intends to use.</p>
<figure>
<img src="img/example-3-3-c.png" style="width:90.0%"
alt="Figure 4: Comparing different precomputed tiers in MISOL." />
<figcaption aria-hidden="true">Figure 4: Comparing different precomputed
tiers in MISOL.</figcaption>
</figure>
<p>In order to handle stress, stress must be marked in a specific
fashion that differs from the current handling in the International
Phonetic Alphabet. First, stress markers must be placed in front of
every syllable in a word, not only in front of stressed syllables.
Second, stress markers receive their own slot, so they must be separated
by a space from the rest of the sequence. Third, stress markers must
start with either the IPA stress marker <code>ˈ</code> or the quotation
mark <code>'</code> (for convenience), or the secondary stress marker
<code>ˌ</code>, but stress markers can be expanded by adding arbitrary
symbols, which allows one to mark different kinds of stress, such as
“unstressed, following a stressed syllable” (which would be needed for
Verner’s law). Thus, with the following sound law, one can handle
Verner’s law as well, by passing the sequence
<code>ˈ f a ˌ t e r</code>, in which we assume that the secondary stress
marker refers to unstressed syllables following a stressed syllable.</p>
<div class="mycode">
<pre class="shell"><code>[p t k] &gt; [f θ x] / @stress[ˌ]_</code></pre>
</div>
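<p>How a stress tier could be derived from this annotation is sketched
below in Python. The sketch assumes that every segment, including the
marker itself, receives the stress marker that opens its syllable; the
function name <code>stress_tier</code> is hypothetical, and MISOL's
internal representation may differ.</p>
<div class="mycode">
<pre class="python"><code># Symbols with which a stress marker may start.
STRESS_STARTS = ("ˈ", "'", "ˌ")

def stress_tier(segments):
    """Give every segment the stress marker that opens its syllable."""
    tier = []
    current = None
    for seg in segments:
        if seg.startswith(STRESS_STARTS):
            current = seg  # a new syllable begins here
        if current is None:
            raise ValueError("the sequence must start with a stress marker")
        tier.append(current)
    return tier

print(" ".join(stress_tier("ˈ f a ˌ t e r".split())))
# prints: ˈ ˈ ˈ ˌ ˌ ˌ ˌ</code></pre>
</div>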
<h4 id="consecutive-sound-laws">3.4 Consecutive Sound Laws</h4>
<p>The basic idea of MISOL is that sound laws often happen at the same
time. For this reason, MISOL does not chain individual sound laws, with
each law deriving a new sequence to which the next sound law is applied,
but rather applies them all at once to the original context.</p>
<p>In most cases, these synchronous sound laws are sufficient and much
easier to handle than consecutive sound laws. In some cases, however, it
is clear that sound laws were active at different stages, called
<em>layers</em> in MISOL. Thus, MISOL allows you to “layer” your sound
laws by assigning them to different layers which are then executed in
the order in which you provide them. As a simple example, consider how
Latin <em>generu-</em> became <em>gendre</em> in French. While one could
model this change in synchronous sound laws in many ways (for example
by just replacing <em>e</em> with <em>d</em> when it occurs between
<em>n</em> and <em>r</em>), it is much easier and also closer to actual
sound change to think of two different major changes that took place
here. First, <em>generu</em> becomes <em>genru</em>, and then the
epenthetic <em>d</em> emerges.</p>
<p>In order to model these sound laws in MISOL, you must assign
individual sets of sound laws to a layer by placing a line with the
layer label in front of them. The label is enclosed in equals signs, one
to its left and one to its right, each separated from the label by a
space.</p>
<div class="mycode">
<pre class="shell"><code>= Layer 1 =
e &gt; - / n _ r

= Layer 2 =
n &gt; n.d / _ r [a e i o u]</code></pre>
</div>
<p>When applying these sound laws, MISOL will display all internal
results, allowing you to track all intermediate forms that lead to the
final form proposed by the tool, according to your sound laws.</p>
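<p>The layered application can be illustrated with the following Python
sketch. It is a deliberately small model under stated assumptions, not
MISOL's implementation: all function names are hypothetical, only
single-sound sources are supported, <code>-</code> is read as deletion,
a dot in the target separates the resulting segments, and within one
layer the first matching law is used. Contexts are always checked
against the input of the layer, so that the laws of one layer apply at
the same time.</p>
<div class="mycode">
<pre class="python"><code>import re

LAWS = """
= Layer 1 =
e > - / n _ r

= Layer 2 =
n > n.d / _ r [a e i o u]
"""

def parse_layers(text):
    """Split a law listing into an ordered list of (layer label, laws)."""
    layers = []
    for line in text.splitlines():
        line = line.strip()
        header = re.fullmatch(r"= (.+) =", line)
        if header:
            layers.append((header.group(1), []))
        elif line:
            if not layers:  # laws that precede any layer label
                layers.append(("default", []))
            layers[-1][1].append(line)
    return layers

def tokens(context):
    """Read 'r [a e i o u]' as [{'r'}, {'a', 'e', 'i', 'o', 'u'}]."""
    found = re.findall(r"\[[^\]]*\]|\S+", context)
    return [set(item.strip("[]").split()) for item in found]

def parse_law(law):
    change, _, context = law.partition("/")
    source, target = [part.strip() for part in change.split(">")]
    left, _, right = context.partition("_")
    return source, target, tokens(left), tokens(right)

def matches(window, pattern):
    return len(window) == len(pattern) and all(
        sound in allowed for sound, allowed in zip(window, pattern))

def apply_layer(laws, segments):
    """Apply all laws of one layer at once to the same input sequence."""
    parsed = [parse_law(law) for law in laws]
    padded = ["^"] + segments + ["$"]
    output = []
    for i in range(1, len(padded) - 1):
        result = [padded[i]]  # sounds without a matching law stay unchanged
        for source, target, left, right in parsed:
            if (padded[i] == source
                    and matches(padded[max(0, i - len(left)):i], left)
                    and matches(padded[i + 1:i + 1 + len(right)], right)):
                result = [] if target == "-" else target.split(".")
                break
        output.extend(result)
    return output

def derive(text, word):
    """Return the input word followed by the result of every layer."""
    forms = [word.split()]
    for _label, laws in parse_layers(text):
        forms.append(apply_layer(laws, forms[-1]))
    return [" ".join(form) for form in forms]

print(derive(LAWS, "g e n e r u"))
# prints: ['g e n e r u', 'g e n r u', 'g e n d r u']</code></pre>
</div>
<p>The final vowel remains <em>u</em> in this sketch, since the two laws
given above do not model any change of the final vowel.</p>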
<h3 id="forward-reconstruction-tab-forward-reconstruction">4 Forward
Reconstruction (Tab “Forward Reconstruction”)</h3>
<p>Forward reconstruction in MISOL is available in different flavors.
What all approaches have in common is that MISOL first uses all
available information on sound classes and sound laws in order to
construct a virtual context window in which sound laws are supposed to
take place. This window can be thought of as a multi-tiered sequence
representation in which context is not handled on the horizontal axis,
but precomputed for each segment in a sequence and represented in
individual tiers, each corresponding to one specific context. For a
sound law by which voiceless stops are voiced in intervocalic
positions, for example, we would need two tiers apart from the base tier
in order to represent context to the left of each segment and context to
the right of each segment. The sound law could be represented as follows
in MISOL (without using any sound classes); we add additional sound laws
for completeness.</p>
<div class="mycode">
<pre class="shell"><code>[p t k] &gt; [b d g] / [a i u] _ [a i u]
[b d g] &gt; [b d g]
[a i u] &gt; [a i u]</code></pre>
</div>
<p>When dealing with a new sequence <code>b a p a</code> now, MISOL has
already inferred from the sound law that we need two additional tiers,
and will now represent the new sequence accordingly, by providing for
each segment its left context and its right context.</p>
<table>
<thead>
<tr class="header">
<th>Tier</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Segments</td>
<td>b</td>
<td>a</td>
<td>p</td>
<td>a</td>
</tr>
<tr class="even">
<td>Segments<code>_</code>Left</td>
<td>^</td>
<td>b</td>
<td>a</td>
<td>p</td>
</tr>
<tr class="odd">
<td>Segments<code>_</code>Right</td>
<td>a</td>
<td>p</td>
<td>a</td>
<td>$</td>
</tr>
</tbody>
</table>
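<p>The construction of the two context tiers shown in the table can be
sketched in Python as follows. The function name
<code>context_tiers</code> and the dictionary layout are chosen for
illustration only; the boundary symbols <code>^</code> and
<code>$</code> are those used in the table.</p>
<div class="mycode">
<pre class="python"><code>def context_tiers(segments):
    """Return the base tier together with its left and right context tiers."""
    # Shift the sequence by one position in either direction and fill
    # the gap with the boundary symbols ^ (word start) and $ (word end).
    left = (["^"] + segments)[:len(segments)]
    right = (segments + ["$"])[1:]
    return {
        "Segments": segments,
        "Segments_Left": left,
        "Segments_Right": right,
    }

for name, tier in context_tiers("b a p a".split()).items():
    print(name, " ".join(tier))
# prints:
# Segments b a p a
# Segments_Left ^ b a p
# Segments_Right a p a $</code></pre>
</div>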
<p>From the sound laws shown above, MISOL will construct vectors that
represent individual contexts. These vectors are based on the
Cartesian product of the different sounds that can appear in the left
and the right context and thus capture all eventualities, as shown in
the table below, which lists the individual vectors for the three sound
laws (in abbreviated form).</p>
<div class="mytable">
<table>
<thead>
<tr class="header">
<th>ID</th>
<th>Law</th>
<th>Segments</th>
<th>Segments<code>_</code>Left</th>
<th>Segments<code>_</code>Right</th>
<th>Target</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>1</td>
<td>1</td>
<td>p</td>
<td>a i u</td>
<td>a i u</td>
<td>b</td>
</tr>
<tr class="even">
<td>2</td>
<td>1</td>
<td>t</td>
<td>a i u</td>
<td>a i u</td>
<td>d</td>
</tr>
<tr class="odd">
<td>3</td>
<td>1</td>
<td>k</td>
<td>a i u</td>
<td>a i u</td>
<td>g</td>
</tr>
<tr class="even">
<td>4</td>
<td>2</td>
<td>b</td>
<td>Ø</td>
<td>Ø</td>
<td>b</td>
</tr>
<tr class="odd">
<td>5</td>
<td>2</td>
<td>d</td>
<td>Ø</td>
<td>Ø</td>
<td>d</td>
</tr>
<tr class="even">
<td>6</td>
<td>2</td>
<td>g</td>
<td>Ø</td>
<td>Ø</td>
<td>g</td>
</tr>
<tr class="odd">
<td>7</td>
<td>3</td>
<td>a</td>
<td>Ø</td>
<td>Ø</td>
<td>a</td>
</tr>
<tr class="even">
<td>8</td>
<td>3</td>
<td>i</td>
<td>Ø</td>
<td>Ø</td>
<td>i</td>
</tr>
<tr class="odd">
<td>9</td>
<td>3</td>
<td>u</td>
<td>Ø</td>
<td>Ø</td>
<td>u</td>
</tr>
</tbody>
</table>
</div>
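<p>As a rough illustration, the abbreviated vectors of the table can be
derived from the three sound laws with the Python sketch below. It rests
on simplifying assumptions: the names <code>sounds</code> and
<code>expand</code> are hypothetical, each context consists of at most
one position, and the alternatives of a context are kept together in one
list instead of being multiplied out into the full Cartesian
product.</p>
<div class="mycode">
<pre class="python"><code>WILDCARD = "Ø"

LAWS = [
    "[p t k] > [b d g] / [a i u] _ [a i u]",
    "[b d g] > [b d g]",
    "[a i u] > [a i u]",
]

def sounds(text):
    """Read 'p' or '[p t k]' as a list of sounds."""
    return text.strip().strip("[]").split()

def expand(laws):
    """Turn each law into one vector per source sound."""
    rows = []
    for law_id, law in enumerate(laws, start=1):
        change, _, context = law.partition("/")
        source, target = change.split(">")
        left, _, right = context.partition("_")
        # Source and target sounds are paired by their position.
        for src, tgt in zip(sounds(source), sounds(target)):
            rows.append({
                "id": len(rows) + 1,
                "law": law_id,
                "segment": src,
                # A missing context matches everything.
                "left": sounds(left) or [WILDCARD],
                "right": sounds(right) or [WILDCARD],
                "target": tgt,
            })
    return rows

for row in expand(LAWS):
    print(row["id"], row["law"], row["segment"],
          " ".join(row["left"]), " ".join(row["right"]), row["target"])</code></pre>
</div>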
<p>When iterating over each position in the multi-tiered sequence, MISOL
will try to find which of the laws (as shown in the table) provides a
vector that matches the current vector in the sequence. The symbol
<code>Ø</code> is a wildcard marker and matches with every sound. For
our example, we can contrast the actual tiers with the precomputed
individual sound laws as shown below.</p>
<div class="mytable">
<table>
<colgroup>
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
<col style="width: 20%" />
</colgroup>
<thead>
<tr class="header">
<th>Tier</th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>Segments</td>
<td>b</td>
<td>a</td>
<td>p</td>
<td>a</td>
</tr>
<tr class="even">
<td>Segments<code>_</code>Left (Source / Context)</td>
<td>^ / Ø</td>
<td>b / Ø</td>
<td>a / <strong>a</strong> i u</td>
<td>p / Ø</td>
</tr>
<tr class="odd">
<td>Segments<code>_</code>Right (Source / Context)</td>
<td>a / Ø</td>
<td>p / Ø</td>
<td>a / <strong>a</strong> i u</td>
<td>$ / Ø</td>
</tr>
<tr class="even">
<td>Sound Law / ID</td>
<td>2 / 4</td>
<td>3 / 7</td>
<td>1 / 1</td>
<td>3 / 7</td>
</tr>
<tr class="odd">
<td>Target</td>
<td>b</td>
<td>a</td>
<td>b</td>
<td>a</td>
</tr>
</tbody>
</table>
</div>
<p>This is the major procedure used by MISOL in order to turn one source
sequence into one target sequence. The method (1) precomputes the
context for the sequence, which allows it to (2) iterate over each
position regardless of the order, searching for matching patterns and
their corresponding output.</p>
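<p>The two steps can be sketched in Python as follows. The vectors are
written out by hand in the abbreviated form used in the table above; the
names <code>VECTORS</code>, <code>fits</code>, and <code>match</code>
are hypothetical and serve only to illustrate the matching.</p>
<div class="mycode">
<pre class="python"><code>WILDCARD = "Ø"
VOWELS = {"a", "i", "u"}
ANY = {WILDCARD}

# (law, segment, left context, right context, target)
VECTORS = [
    (1, "p", VOWELS, VOWELS, "b"),
    (1, "t", VOWELS, VOWELS, "d"),
    (1, "k", VOWELS, VOWELS, "g"),
    (2, "b", ANY, ANY, "b"),
    (2, "d", ANY, ANY, "d"),
    (2, "g", ANY, ANY, "g"),
    (3, "a", ANY, ANY, "a"),
    (3, "i", ANY, ANY, "i"),
    (3, "u", ANY, ANY, "u"),
]

def fits(sound, context):
    """The wildcard matches every sound, other contexts must contain it."""
    return WILDCARD in context or sound in context

def match(segments):
    """List, for each position, every (law, vector ID, target) that fits."""
    # Step 1: precompute the context tiers.
    left = (["^"] + segments)[:len(segments)]
    right = (segments + ["$"])[1:]
    # Step 2: look up each position independently of all others.
    hits = []
    for seg, lft, rgt in zip(segments, left, right):
        hits.append([
            (law, vid, target)
            for vid, (law, source, lctx, rctx, target)
            in enumerate(VECTORS, start=1)
            if source == seg and fits(lft, lctx) and fits(rgt, rctx)
        ])
    return hits

word = "b a p a".split()
for seg, found in zip(word, match(word)):
    print(seg, found)
# prints:
# b [(2, 4, 'b')]
# a [(3, 7, 'a')]
# p [(1, 1, 'b')]
# a [(3, 7, 'a')]</code></pre>
</div>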
<h4 id="strict-and-ordered-reconstruction-mode">4.1 Strict and Ordered
Reconstruction Mode</h4>
<p>The procedure as outlined here is what is called the “strict” mode in
MISOL when using forward reconstruction. It is called “strict” with
respect to the mode of reconstruction, since it does not tolerate that
different sound laws match the same context and yield different output
(MISOL will explicitly mark these cases). Users can choose between the
strict mode and the “ordered” mode, in which the matching process is
modified in such a way that, in the case of competing sound laws, the law
that was defined first wins. This makes coding sound laws much more
convenient, since one can first define a very strict law and later
define a general law that would hold for all other cases not touched by
this first law.</p>
<div class="mycode">
<pre class="shell"><code>s &gt; ʃ / _ [p t k]
s &gt; s</code></pre>
</div>
<p>When passing a sequence <code>s p a s a</code> to these sound laws, they
would yield the output <code>s|ʃ p a s a</code> in strict mode, and
<code>ʃ p a s a</code> in ordered mode, using the pipe to indicate
competing sounds (see <a
href="https://aclanthology.org/2023.lchange-1.3">List et al. 2023</a>
for this notation).</p>
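<p>The difference between the two modes can be illustrated with a short
Python sketch. It is not MISOL's implementation: the function name
<code>reconstruct</code> is hypothetical, the two laws are written out
by hand, sounds without any matching law are kept unchanged, and the
alphabetical ordering of competing sounds in strict mode is an
assumption made for this example.</p>
<div class="mycode">
<pre class="python"><code>WILDCARD = "Ø"

# (source, left context, right context, target), in order of definition
LAWS = [
    ("s", {WILDCARD}, {"p", "t", "k"}, "ʃ"),  # s > ʃ / _ [p t k]
    ("s", {WILDCARD}, {WILDCARD}, "s"),       # s > s
]

def fits(sound, context):
    return WILDCARD in context or sound in context

def reconstruct(segments, mode="strict"):
    left = (["^"] + segments)[:len(segments)]
    right = (segments + ["$"])[1:]
    output = []
    for seg, lft, rgt in zip(segments, left, right):
        # Collect the targets of all matching laws in order of definition.
        targets = []
        for source, lctx, rctx, target in LAWS:
            if (source == seg and fits(lft, lctx) and fits(rgt, rctx)
                    and target not in targets):
                targets.append(target)
        if not targets:
            output.append(seg)  # no law matches: keep the sound
        elif mode == "ordered":
            output.append(targets[0])  # the law defined first wins
        else:
            output.append("|".join(sorted(targets)))  # show all competitors
    return " ".join(output)

print(reconstruct("s p a s a".split()))                  # prints: s|ʃ p a s a
print(reconstruct("s p a s a".split(), mode="ordered"))  # prints: ʃ p a s a</code></pre>
</div>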
<h4 id="treatment-of-missing-sound-laws">4.2 Treatment of Missing Sound
Laws</h4>
<p>Scholars typically omit the “boring” sound laws from their
descriptions, specifically those cases where no sound change happens.
Thus, we rarely find a sound law such as the following in the
literature.</p>
<div class="mycode">
<pre class="shell"><code>t &gt; t</code></pre>
</div>
<p>MISOL tolerates the omission of sound laws in those
cases where sounds don’t change. The <em>Forward Reconstruction</em> tab
lets users select whether missing sound laws should be
<em>marked</em> or <em>ignored</em>. If they are ignored, the original
sound is used unchanged. If they are marked, the sound will be preceded
by an exclamation mark and marked in red.</p>
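<p>The effect of the two options on the output can be sketched in Python
as follows. The sketch is restricted to context-free laws for brevity,
the names <code>LAWS</code> and <code>apply_laws</code> are
hypothetical, and the red coloring, which is a matter of display, is not
modeled.</p>
<div class="mycode">
<pre class="python"><code># Assumption: a single context-free law, p > b.
LAWS = {"p": "b"}

def apply_laws(segments, missing="ignored"):
    """Apply the laws and either ignore or mark sounds without a law."""
    output = []
    for seg in segments:
        if seg in LAWS:
            output.append(LAWS[seg])
        elif missing == "marked":
            output.append("!" + seg)  # flag the missing sound law
        else:
            output.append(seg)  # keep the original sound unchanged
    return " ".join(output)

print(apply_laws("t a p a".split()))                    # prints: t a b a
print(apply_laws("t a p a".split(), missing="marked"))  # prints: !t !a b !a</code></pre>
</div>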
<h4 id="tiers-in-consecutive-sound-change-processes">4.3 Tiers in
Consecutive Sound Change Processes</h4>
<p>At the moment, you can only pass explicit tiers (such as accent)
once. Since their main purpose is to explain the initial
change, they cannot be used in consecutive sound law processes: this
would require a routine by which tiers from a source word
turn into tiers from a target word. Using computed tiers is not yet
implemented but will be available at some point.</p>
<h3 id="backward-reconstruction">5 Backward Reconstruction</h3>
<h3 id="import-and-export">6 Import and Export</h3>
</body>
</html>