forked from KanjiVG/kanjivg.github.com
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathsvg-format.html
More file actions
478 lines (397 loc) · 15.7 KB
/
svg-format.html
File metadata and controls
478 lines (397 loc) · 15.7 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
---
layout: default
title: SVG Format
name: svg-format.html
---
<div class="page-header">
<h1>SVG Format</h1>
</div>
<p>
This page details the SVG extensions used by KanjiVG to add
information on components of kanji, such as the radicals, as well as
information on the expected shapes of strokes.
</p>
<p>
All of the SVG extensions used by KanjiVG are XML attributes with an
added <code>kvg:</code> suffix, a "namespace" in XML terminology.
</p>
<p id="strokepaths-intro">
There are two
root <a href="http://www.w3.org/TR/SVG/struct.html#Groups">SVG
groups</a>. The <strong>StrokePaths</strong> group is a set of
standard SVG paths that gives the strokes of the kanji. The strokes
are ordered in the <a href="glossary.html#stroke-order">stroke
order</a> given by the <a href="ref.html">references</a>. It also
contains groups which describe the structure of the kanji, such as
its decomposition into <a href="glossary.html#element">elements</a>,
and which strokes comprise
its <a href="glossary.html#radical">radical</a>.
</p>
<p id="strokenumbers-intro">
The <strong>StrokeNumbers</strong> group is an optional group that
gives a convenient position
for <a href="#stroke-numbers">stroke-order numbers</a>, useful for
displaying in printed material for instance. The stroke numbers are
positioned near their corresponding strokes, at their starting
points.
</p>
<p class='undocumented'>
Undocumented and unknown features, and notes on possible flaws in
the data, are distinguished with a pale green background.
</p>
<h2 id="strokepaths-groups">StrokePaths groups</h2>
<p>
Kanji are often made of several components, referred to
as <a href="glossary.html#element">elements</a> in this
documentation. For instance,
<a href="viewer.html?kanji=頑">頑</a> can be seen as a combination of
<a href="viewer.html?kanji=元">元</a> on its left and
<a href="viewer.html?kanji=頁">頁</a> on its right. KanjiVG uses SVG
groups to reflect this structure, so the KanjiVG entry contains two
groups under the parent <code>StrokePaths</code> group containing
the left and right side of the kanji. The 元 and 頁 are encoded in
these groups under their <a href="#element">element</a>
attributes. SVG groups provide an elegant way to collect strokes
into a given group.
</p>
<p>
In the case that the elements do not consist of contiguous strokes,
KanjiVG uses the <a href="#part">part</a>
and <a href="#number">number</a> XML attributes to distinguish them.
</p>
<img src="https://raw.githubusercontent.com/KanjiVG/kanjivg/master/kanji/09811.svg" width="436" height="436">
<p>
This section explains the SVG attributes used in the groups under
the <strong>StrokePaths</strong> group which describe the structure
of a kanji, such as radicals and other sub-elements.
</p>
<h3>General attributes</h3>
<h4 id="group-id">id</h4>
<p>
The KanjiVG identification number for this group. It contains the
Unicode value of the kanji as a five-digit lower-case hexadecimal
number, followed by a hyphen and the letter "g", followed by a decimal
number from one to the total number of groups.
</p>
<p>
The group ID numbers are always consecutive positive whole numbers.
</p>
<h3>KVG namespace attributes</h3>
<p>
All these attributes are placed under the <strong>kvg</strong>
namespace, for example <code>kvg:original="家"</code>
</p>
<h4 id="element">element</h4>
<p>
This attribute specifies which kanji best represents the group
physically. It should be the Unicode character that resembles the
group as much as possible.
</p>
<p>
The value of <code>element</code> on the outermost group of the
strokes is the same as the kanji represented by the SVG.
</p>
<h4 id="number">number</h4>
<p>
This relatively rare attribute allows an element of a kanji to be
identified when it is both represented several times in the kanji,
and, due to the stroke order, more than one of these representations
is broken into parts, so that the <a href="#part">part attribute</a>
has to be used for more than one element. In other words, the number
attribute is a way to uniquely identify the part when it becomes
ambiguous.
</p>
<p>
It is only used in a few places in kanjivg where there are two
different sets of the same element, such as 05716.svg, the character
圖, where there are four 口 elements, two of which are broken into
parts one and two due to the stroke order. Please
inspect <a href="https://raw.githubusercontent.com/KanjiVG/kanjivg/master/kanji/05716.svg">the
source code of that SVG file</a> to understand
what <code>kvg:number</code> attribute does.
</p>
<p>
Generally, elements which can be represented by contiguous blocks of
strokes do not have a number attribute, even if multiple cases of
the same element occur in a character, so, for example, the 口
elements of 品 do not have a number attribute.
</p>
<h4 id="original">original</h4>
<p>
This attribute specifies which kanji represents the group from a
semantic point of view. This attribute only needs to be present if
there is a difference between the semantic and physical
representation of the group.
</p>
<p>
For example, <a href="viewer.html?kanji=仮">仮</a> has two groups.
The left one has <a href="viewer.html?kanji=亻">亻</a>
(called <em>ninben</em>) for its <code>element</code> attribute, and
<a href="viewer.html?kanji=人">人</a>, meaning "person", for
its <code>original</code> attribute, because <em>ninben</em> is a
variation of 人. However, the right side has
<a href="viewer.html?kanji=反">反</a> for <code>element</code>,
which is not a variation, so an <code>original</code> attribute is
not necessary.
</p>
<figure>
<pre>
<g id="kvg:StrokePaths_04eee" style="fill:none;stroke:#000000;stroke-width:3;stroke-linecap:round;stroke-linejoin:round;">
<g id="kvg:04eee" kvg:element="仮">
<g id="kvg:04eee-g1" <span style='color:red'>kvg:element="亻"</span> kvg:variant="true" <span style='color:red'>kvg:original="人"</span> kvg:position="left" kvg:radical="general">
<path id="kvg:04eee-s1" kvg:type="㇒" d="M32.01,17c0.22,1.93-0.31,3.72-1.02,5.37C26.5,32.93,20.8,42.85,10.5,55.7"/>
<path id="kvg:04eee-s2" kvg:type="㇑" d="M25.48,37.5c0.57,0.57,1,1.69,1,3.24c0,11.3,0,33.32,0,46.02c0,3.05,0,5.56,0,7.25"/>
</g>
<g id="kvg:04eee-g2" <span style='color:red'>kvg:element="反"</span> kvg:position="right" kvg:phon="叚V/反">
<g id="kvg:04eee-g3" kvg:element="厂">
<path id="kvg:04eee-s3" kvg:type="㇐" d="M47.34,22.01c1.27,0.33,3.61,0.53,4.86,0.33c11.42-1.84,23.3-4.59,32.75-5.84c2.08-0.27,3.38,0.16,4.44,0.32"/>
<path id="kvg:04eee-s4" kvg:type="㇒" d="M52.65,23.81c1.14,1.14,1.49,2.48,1.42,4.7c-0.74,22.33-5.06,47.31-16.01,61.4"/>
</g>
<g id="kvg:04eee-g4" kvg:element="又">
<path id="kvg:04eee-s5" kvg:type="㇇" d="M56.7,41.74c1.51,0.37,2.95,0.43,5.97-0.13c3.02-0.56,15.29-4.17,17.37-4.71c2.97-0.77,4.79,1.08,3.83,4.1c-6.62,20.75-11.62,38.63-36.26,53.52"/>
<path id="kvg:04eee-s6" kvg:type="㇏" d="M56.12,52.12c5.64,0.81,18.99,22.02,31.03,33.71c2.35,2.29,5.22,4.66,7.84,6.11"/>
</g>
</g>
</g>
</g>
</pre>
<figcaption>
Illustration of the structure of the KanjiVG file for 仮 (U+4EEE).
</figcaption>
</figure>
<h4 id="part">part</h4>
<p>
When the elements of a group of kanji strokes which forms a larger
unit are not consecutive strokes, the group of strokes may be spread
over several groups of paths in the file. The <strong>part</strong>
attribute allows numbering these groups and defines them as being
part of the same component. There is also a <a href="#number">number
attribute</a> which can be used in the rare cases that two groups
with the same element have non-consecutive strokes within the same
character.
</p>
<h4 id="partial">partial</h4>
<p>
Should be present and set to <strong>true</strong> if the group only
represents the <strong>element</strong> attribute partially, i.e. if
not all its strokes are present.
</p>
<h4 id="phon">phon</h4>
<p>
A large number of kanji consist of a radical and a phoneticum, the
Sino-Japanese pronunciation. The <code>phon</code> attribute should
mark the part indicating the pronunciation.
</p>
<p class='undocumented'>
The values of this attribute are inconsistent, and the meanings of
many of them are completely
undocumented. See
<a href="https://github.com/KanjiVG/kanjivg/issues/312">issue 312 on
Github</a>
for more details.
</p>
<h4 id="position">position</h4>
<p>
Defines where this groups is located with respect to the other
groups with the same parent. Not every element has a "position"
value. Possible values are
</p>
<dl>
<dt id="bottom">bottom</dt>
<dd>
This part is under another part.
</dd>
<dt id="kamae">kamae</dt>
<dd>
This part is wrapped around another part, such as 門.
<span class='undocumented'>This is used very inconsistently in KanjiVG
as a grab-bag for various different structures.</span>
</dd>
<dt id="left">left</dt>
<dd>
This part is left of another part.
</dd>
<dt id="nyo">nyo</dt>
<dd>
This part is left and under another part, such as 辶.
</dd>
<dt id="nyoc">nyoc</dt>
<dd>
This part is the complement or counterpart of a <a href="#nyo">nyo</a> part.
</dd>
<!-- add illustration here -->
<dt id="right">right</dt>
<dd>
This part is right of another part.
</dd>
<dt id="tare">tare</dt>
<dd>
This part is left and above another part, such as 广.
</dd>
<dt id="tarec">tarec</dt>
<dd>
This part is the complement or counterpart of a <a href="#tare">tare</a> part.
</dd>
<dt id="top">top</dt>
<dd>
This part is above another part.
</dd>
</dl>
<h4 id="radical">radical</h4>
<p>
This is set to a value if this group of strokes is considered
a <a href="glossary.html#radical">radical</a> of the kanji, and by
which reference. The value of the attribute depends on the
reference, as follows.
</p>
<dl>
<!-- Alphabetical order -->
<dt id="radical-general">
general
</dt>
<dd>
The generally accepted radical which authors agree on.
</dd>
<dt id="radical-jis">
jis
</dt>
<dd>
This marks the radicals used
by <a href="ref.html#jis-jiten-ref">JIS Kanji Jiten</a>, used
by <a href="ref.html#kanjidic">Kanjidic</a>, which sometimes
differ from the <code>general</code> or <code>tradit</code>
radicals. This value was added to deal with inconsistencies
between KanjiVG and Kanjidic and other references.
</dd>
<dt id="radical-nelson">
nelson
</dt>
<dd>
The keyword "nelson" is used for
<a href="glossary.html#nelson-radicals">Nelson radicals</a>.
</dd>
<dt id="radical-tradit">
tradit
</dt>
<dd>
The keyword "tradit" is used for the "traditional" radical, where
the <a href="glossary.html#kangxi-radicals">Kangxi radical</a>
disagrees with Nelson.
</dd>
</dl>
<figure>
<pre>
<g id="kvg:04e94" kvg:element="五">
<span style='color:red'><g id="kvg:04e94-g1" kvg:element="二" kvg:part="1" kvg:radical="tradit"></span>
<span style='color:red'><g id="kvg:04e94-g2" kvg:element="一" kvg:radical="nelson"></span>
<path id="kvg:04e94-s1" kvg:type="㇐" d="M31.75,23.15c2.8,0.67,5.54,0.42,8.36,0.12c9.3-0.99,22.18-2.4,34.14-3.21c2.49-0.17,5.04-0.33,7.5,0.2"/>
</g>
</g>
<path id="kvg:04e94-s2" kvg:type="㇑a" d="M55.75,25.25c0.62,1.25,1.02,3.01,0.5,5c-3.12,11.88-14,44.12-19.75,59"/>
<path id="kvg:04e94-s3" kvg:type="㇕c" d="M25.5,55.25c2.07,1.24,4.73,1.03,7,0.81c15.49-1.45,29.89-3.03,42.25-4.06c3-0.25,4.25,1.75,3.5,3.75c-2.24,5.96-6,20.75-7.75,31.5"/>
<span style='color:red'><g id="kvg:04e94-g3" kvg:element="二" kvg:part="2" kvg:radical="tradit"></span>
<path id="kvg:04e94-s4" kvg:type="㇐" d="M11.25,90.5c3.04,0.81,6.52,0.63,9.63,0.41c15.71-1.1,43.9-2.8,67.75-3.8c3.41-0.14,6.9-0.4,10.25,0.39"/>
</g>
</g>
</pre>
<figcaption>
The kanji for "five", 五, has a traditional Kangxi radical 二, which
is split into two parts, the upper and lower character, and a Nelson
radical of 一, which is the upper character. The two parts of the 二
are given <a href="#part">part</a> numbers of 1 and 2.
</figcaption>
</figure>
<p>
Unicode has more than one code point which may represent each
radical. The choices of radicals which have been used by KanjiVG are
explained on the <a href="radicals.html">Radicals</a> page.
</p>
<h4 id="radicalForm">radicalForm</h4>
<p>
This is set to the value <strong>true</strong> for a limited number
of groups where a radical-like form of a character described
by <a href="#original">original</a> is provided as
the <a href="#element">element</a>.
</p>
<h4 id="tradForm">tradForm</h4>
<p>
The Kanjidic file with which Ulrich Apel worked in the beginning
favored the radicals given in the Nelson character dictionary, which
sometimes differ from the radicals given in "traditional" Japanese
dictionaries and have mark-up as well.
</p>
<h4 id="variant">variant</h4>
<p class='undocumented'>
Unknown, possibly used to indicate that the shape of the element is
unlike the usual grapheme.
</p>
<h2>Strokes</h2>
<p>
Each individual kanji stroke is represented by one SVG <path>
element.
</p>
<h3>General attributes</h3>
<h4 id="d">d</h4>
<p>
The SVG path information itself. This describes the shape of the
line.
</p>
<p>
Although there is no rule disallowing various SVG elements, in
practice all of the KanjiVG data consists of cubic bezier curves. In
the SVG terminology the path is made up of only M/m, C/c, and S/s
elements. There are no other SVG path elements present. None of the
strokes contains a path with more than one sub-path, that is to say
there are no strokes with more than one "moveto" element.
</p>
<h4 id="stroke-id">id</h4>
<p>
The KanjiVG identification number for this stroke. It contains the
prefix <code>kvg:</code> followed by the Unicode value of the kanji
as a five-digit lower-case hexadecimal number, followed by any
variant information, followed by a hyphen and the letter "s",
followed by a decimal number from one to the total stroke count. For
example stroke 3 of the file <code>kanji/053ec.svg</code> has the ID
number <code>kvg:053ec-s3</code>.
</p>
<p>
The stroke IDs are consecutive positive whole numbers starting from
1 which correspond to <a href="#stroke-numbers">the stroke
number</a> of the stroke.
</p>
<h3>KVG namespace attributes</h3>
<p>
These attributes are under the <code>kvg:</code> namespace.
</p>
<h4 id="type">type</h4>
<p>
The shape of the stroke. It can be used to know how the stroke
should be rendered.
</p>
<p>
The values of this attribute use the keys of
Unicode's <a href="glossary.html#cjk-strokes">CJK Strokes</a>, which
occupy code positions from U+31C0 to U+31EF. The names of these,
such as D or HZ, are the initials of the Chinese names.
</p>
<p>
Please see the <a href="stroke-types.html">Stroke types</a> page for full
information on stroke types.
</p>
<h2 id="stroke-numbers">Stroke numbers</h2>
<p>
Stroke numbers are represented by a top-level group with an ID of
the form <code>kvg:StrokeNumbers_abcde</code>,
where <code>abcde</code> is the identifier of the file. This group
contains <code>text</code> elements. Each text element is located on
the diagram using a <code>transform</code> attribute. The text
within each text element is the stroke number in digits, from one to
the total number of strokes. The stroke numbers <em>should</em>
correspond to the <a href="#stroke-id">id value of the individual
strokes</a>.
</p>
<p>
The stroke numbers are located to the side of the beginning of the
stroke whose order they indicate. Generally, they should not overlap
the strokes.
</p>