- Pecha.from_path()
- Pecha.create()
- Pecha.base_path()
- Pecha.layer_path()
- Pecha.metadata_path()
- Pecha.get_base()
- Pecha.set_base()
- Pecha.add_layer()
- Pecha.add_annotation()
- Pecha.set_metadata()
- Pecha.get_layer_by_ann_type()
- Pecha.publish()
- Pecha.merge_pecha()
- JsonSerializer.get_base()
- JsonSerializer.to_dict()
- JsonSerializer.get_edition_base()
- JsonSerializer.get_layer_paths()
- JsonSerializer.serialize()
- JsonSerializer.serialize_edition_annotations()
- DocxRootParser.parse()
- DocxRootParser.extract_anns()
- DocxRootParser.extract_segmentation_anns()
- DocxRootParser.extract_alignment_anns()
- DocxSimpleCommentaryParser.parse()
- DocxSimpleCommentaryParser.extract_anns()
- DocxSimpleCommentaryParser.extract_segmentation_anns()
- DocxSimpleCommentaryParser.extract_alignment_anns()
- DocxAnnotationUpdate.extract_layer_name()
- DocxAnnotationUpdate.extract_layer_id()
- DocxAnnotationUpdate.extract_layer_enum()
- DocxAnnotationUpdate.update_annotation()
- TranslationAlignmentTransfer.is_empty()
- TranslationAlignmentTransfer.get_segmentation_ann_path()
- TranslationAlignmentTransfer.map_layer_to_layer()
- TranslationAlignmentTransfer.get_root_pechas_mapping()
- TranslationAlignmentTransfer.get_translation_pechas_mapping()
- TranslationAlignmentTransfer.mapping_to_text_list()
- TranslationAlignmentTransfer.get_serialized_translation_alignment()
- TranslationAlignmentTransfer.get_serialized_translation_segmentation()
- CommentaryAlignmentTransfer.is_valid_ann()
- CommentaryAlignmentTransfer.get_segmentation_ann_path()
- CommentaryAlignmentTransfer.index_annotations_by_root()
- CommentaryAlignmentTransfer.map_layer_to_layer()
- CommentaryAlignmentTransfer.get_root_pechas_mapping()
- CommentaryAlignmentTransfer.get_commentary_pechas_mapping()
- CommentaryAlignmentTransfer.get_serialized_commentary()
- CommentaryAlignmentTransfer.get_serialized_commentary_segmentation()
- CommentaryAlignmentTransfer.format_serialized_commentary()
- CommentaryAlignmentTransfer.process_commentary_ann()
Loads a Pecha instance from a local path.
- Parameters:
pecha_path(Path): Path to the Pecha directory
- Returns:
Pecha instance
- Example:
from pathlib import Path
from openpecha.pecha import Pecha

pecha = Pecha.from_path(Path("/path/to/pecha"))
Creates a new Pecha instance in the specified output directory.
- Parameters:
output_path(Path): Directory where the Pecha should be created
pecha_id(str, optional): Custom Pecha ID. If not provided, a new ID will be generated
- Returns:
Pecha instance
- Example:
from pathlib import Path
from openpecha.pecha import Pecha

pecha = Pecha.create(Path("./output"))
Returns the path to the base directory which contains all the base files. If the directory does not exist, it is created.
- Returns: Path object pointing to the base directory
- Example:
base_dir = pecha.base_path
print(base_dir)  # /path/to/pecha/base
Returns the path to the layers directory which contains all the annotation files. If the directory does not exist, it is created.
- Returns: Path object pointing to the layers directory
- Example:
layer_dir = pecha.layer_path
print(layer_dir)  # /path/to/pecha/layers
Returns the path to the metadata file.
- Returns: Path object pointing to the metadata file
- Example:
metadata_file = pecha.metadata_path
print(metadata_file)  # /path/to/pecha/metadata.json
Gets the content of a base file by its name.
- Parameters:
base_name(str): Name of the base file
- Returns: str containing the base text content
- Example:
base_text = pecha.get_base("base1")
Sets the content of a base file.
- Parameters:
content(str): Text content to write to the base file
base_name(str, optional): Name for the base file. If not provided, a new ID will be generated
- Returns: str containing the base name
- Example:
base_name = pecha.set_base("This is the text content", "base1")
Adds a new annotation layer for a given base.
- Parameters:
base_name(str): Name of the base file to associate with this layer
layer_type(AnnotationType): Type of annotation layer (must be a member of the AnnotationType enum)
- Returns: Tuple of (AnnotationStore, Path) containing:
- AnnotationStore: The created annotation store
- Path: Path to the layer file
- Example:
from openpecha.pecha.layer import AnnotationType

# Add a segmentation layer
layer, layer_path = pecha.add_layer("base1", AnnotationType.SEGMENTATION)

# Add a chapter layer
layer, layer_path = pecha.add_layer("base1", AnnotationType.CHAPTER)
- Note: The layer file is created with a name of the form {layer_type}-{random_id}.json in the layers directory, under a folder named after the base.
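The naming convention in the note above can be illustrated with a short standalone sketch. It does not use the library; `build_layer_path` is a hypothetical helper, and the 4-character uppercase hex ID is an assumption modelled on IDs such as "4FD1" that appear elsewhere in this reference.

```python
from pathlib import Path
from uuid import uuid4


def build_layer_path(layers_dir: Path, base_name: str, layer_type: str) -> Path:
    """Return <layers_dir>/<base_name>/<layer_type>-<random_id>.json (hypothetical helper)."""
    # Assumption: a short uppercase hex ID; the real ID scheme may differ.
    random_id = uuid4().hex[:4].upper()
    return layers_dir / base_name / f"{layer_type}-{random_id}.json"


layer_path = build_layer_path(Path("/path/to/pecha/layers"), "base1", "segmentation")
print(layer_path.parent.name)  # folder named after the base
print(layer_path.name)         # segmentation-<random_id>.json
```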
Adds an annotation to an existing annotation layer (Annotation Store).
- Parameters:
ann_store(AnnotationStore): The annotation store/layer to add the annotation to
annotation(BaseAnnotation): The annotation object to add (e.g., SegmentationAnnotation, CitationAnnotation)
layer_type(AnnotationType): The type of annotation (must match the layer type)
- Returns: AnnotationStore with the added annotation
- Example:
from openpecha.pecha.annotations import Span, SegmentationAnnotation
from openpecha.pecha.layer import AnnotationType

# Create a segmentation annotation
ann = SegmentationAnnotation(span=Span(start=0, end=10), index=1)

# Add the annotation to the layer
layer = pecha.add_annotation(layer, ann, AnnotationType.SEGMENTATION)

# Save the layer after adding annotations
layer.save()
- Note:
- The annotation's span must be valid for the base text
- The layer_type must match the type of annotation being added
- The layer must be saved after adding annotations to persist the changes
Updates the Pecha's metadata with new values while preserving existing metadata fields if not overridden.
- Parameters:
pecha_metadata(Dict): Dictionary containing metadata fields to update. Can include:
- title(Dict[str, str] | str): Title in different languages or a single language
- author(List[str] | Dict[str, str] | str): Author(s) information
- language(str): Language code (e.g., 'bo', 'en')
- parser(str): Name of the parser used
- initial_creation_type(str): How the Pecha was created
- source_metadata(Dict): Additional source information
- copyright(Dict): Copyright information
- licence(str): License type
- Returns: Updated PechaMetaData object
- Example:
# Update metadata with new values
pecha.set_metadata({
    "title": {"en": "New Title", "bo": "གསར་བཅོས་ཁ་བྱང་།"},
    "author": ["Author 1", "Author 2"],
    "language": "bo",
    "source_metadata": {
        "id": "source123",
        "publisher": "Publisher Name"
    }
})

# Update specific fields while preserving others
pecha.set_metadata({
    "title": {"en": "Updated Title"},
    "copyright": {
        "year": "2024",
        "holder": "Copyright Holder"
    }
})
- Note:
- Existing metadata fields not included in the update dictionary will be preserved
- The parser and initial_creation_type fields will be preserved from existing metadata if not specified
- The metadata is automatically saved to the metadata.json file
- Invalid metadata will raise a ValueError
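The preservation behaviour described in the notes above can be pictured as a dictionary merge. The following is only a sketch under the assumption of a shallow, top-level merge; `merge_metadata` is a hypothetical helper and does not reflect how the library validates or stores metadata.

```python
from typing import Any, Dict


def merge_metadata(existing: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where fields in `update` override those in `existing`."""
    # Assumption: top-level fields are replaced wholesale, not merged recursively.
    merged = dict(existing)
    merged.update(update)
    return merged


existing = {"title": {"en": "New Title"}, "language": "bo", "parser": "DocxRootParser"}
update = {"title": {"en": "Updated Title"}, "copyright": {"year": "2024"}}
merged = merge_metadata(existing, update)

print(merged["title"])     # overridden by the update
print(merged["language"])  # preserved from the existing metadata
print(merged["parser"])    # preserved because the update does not specify it
```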
Pecha.get_layer_by_ann_type() -> Union[Tuple[AnnotationStore, Path], Tuple[List[AnnotationStore], List[Path]]]
Gets layers by annotation type.
- Parameters:
base_name(str): Name of the base file
layer_type(AnnotationType): Type of annotation to retrieve
- Returns: Tuple of (AnnotationStore or list of AnnotationStore, Path or list of Path)
- Example:
layer, layer_path = pecha.get_layer_by_ann_type("base1", AnnotationType.SEGMENTATION)
Publishes the Pecha to GitHub and optionally creates a release with assets.
- Parameters:
asset_path(Path, optional): Path to the asset directory
asset_name(str, optional): Name for the asset. Defaults to "source_data"
branch(str, optional): Branch to publish to. Defaults to "main"
is_private(bool, optional): Whether the repository should be private. Defaults to False
- Example:
pecha.publish(
    asset_path=Path("./assets"),
    asset_name="source_data",
    branch="main",
    is_private=False
)
Merges the layers of a source pecha into the current pecha.
- Parameters:
source_pecha(Pecha): The source Pecha instance
source_base_name(str): The base name of the source pecha
target_base_name(str): The base name of the target (current) pecha
- Example:
pecha.merge_pecha(source_pecha, "source_base", "target_base")
Returns the base text from the first base in the given Pecha.
- Parameters:
pecha(Pecha): The Pecha object to extract the base from
- Returns: str containing the base text
- Example:
from openpecha.pecha.serializers.json_serializer import JsonSerializer

base = JsonSerializer().get_base(pecha)
Converts an AnnotationStore to a list of annotation dictionaries for the given annotation type.
- Parameters:
ann_store(AnnotationStore): The annotation store to convert
ann_type(AnnotationType): The type of annotation
- Returns: List of annotation dictionaries
- Example:
anns = JsonSerializer.to_dict(ann_store, AnnotationType.SEGMENTATION)
Constructs a new base text by applying version variant operations (insertions/deletions) from an edition layer.
- Parameters:
pecha(Pecha): The Pecha object
edition_layer_path(str): Path to the edition layer file
- Returns: str containing the constructed edition base text
- Example:
from openpecha.pecha.serializers.json import JsonSerializer

edition_base = JsonSerializer().get_edition_base(pecha, "4C00/version-9D95.json")
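To make the idea of applying insertions and deletions to a base text concrete, here is a minimal standalone sketch. The operation format (dicts with "operation", "start", "end", and "text" keys) and the helper `apply_operations` are assumptions for illustration only; the edition layer's actual on-disk structure is not described in this reference.

```python
from typing import Dict, List, Union

Operation = Dict[str, Union[str, int]]


def apply_operations(base: str, operations: List[Operation]) -> str:
    """Apply insertion/deletion operations whose offsets refer to the original base."""
    result = base
    # Work from the highest offset down so earlier offsets remain valid.
    for op in sorted(operations, key=lambda o: o["start"], reverse=True):
        start = op["start"]
        if op["operation"] == "insertion":
            result = result[:start] + op["text"] + result[start:]
        elif op["operation"] == "deletion":
            result = result[:start] + result[op["end"]:]
        else:
            raise ValueError(f"Unknown operation: {op['operation']}")
    return result


base = "The quick fox"
operations = [
    {"operation": "insertion", "start": 10, "text": "brown "},
    {"operation": "deletion", "start": 0, "end": 4},
]
edition_base = apply_operations(base, operations)
print(edition_base)  # quick brown fox
```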
Extracts layer paths from a Pecha based on annotation information.
- Parameters:
pecha(Pecha): The Pecha object
annotations(list[dict]): List of annotation dictionaries with 'type' and 'name' keys
- Returns: List of layer paths
- Example:
from openpecha.pecha.serializers.json import JsonSerializer

annotations = [{"type": "segmentation", "name": "4FD1"}]
layer_paths = JsonSerializer().get_layer_paths(pecha, annotations)
Serializes a Pecha with its annotations based on manifestation information, returning base text and annotations.
- Parameters:
pecha(Pecha): The Pecha object to serialize
manifestation_info(dict, optional): Dictionary containing annotation information
- Returns: Dictionary with keys base(str) and annotations(dict of annotation lists)
- Example:
from openpecha.pecha.serializers.json import JsonSerializer

manifestation_info = {
    "annotations": [{"type": "segmentation", "name": "4FD1"}]
}
result = JsonSerializer().serialize(pecha, manifestation_info)
JsonSerializer.serialize_edition_annotations(pecha: Pecha, edition_layer_path: str, layer_path: str) -> dict
Serializes annotations that are based on an edition base rather than the original base.
- Parameters:
pecha(Pecha): The Pecha object
edition_layer_path(str): Path to the edition layer file
layer_path(str): Path to the layer file to serialize
- Returns: Dictionary with keys base(str) and annotations(dict of annotation lists)
- Example:
from openpecha.pecha.serializers.json import JsonSerializer

result = JsonSerializer().serialize_edition_annotations(
    pecha,
    "4C00/version-9D95.json",
    "4C00/segmentation-4FD1.json"
)
Parses a DOCX file and creates a Pecha object with annotations.
- Parameters:
input(str | Path): Path to the DOCX file to be parsed
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
metadata(Dict): Dictionary containing metadata for the Pecha
output_path(Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
pecha_id(str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
- Returns: Tuple containing:
- Pecha: The created Pecha instance
- annotation_path: Path to the created annotation layer file
- Example:
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
pecha, layer_path = parser.parse(
    input="path/to/file.docx",
    annotation_type=AnnotationType.SEGMENTATION,
    metadata={"title": "Sample Title"},
    output_path=Path("./output")
)
Extracts text and annotations from a DOCX file.
- Parameters:
docx_file(Path): Path to the DOCX file
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
- Returns: Tuple containing:
- List[BaseAnnotation]: List of extracted annotations
- str: The extracted base text
- Example:
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
anns, base = parser.extract_anns(
    Path("path/to/file.docx"),
    AnnotationType.SEGMENTATION
)
Extracts segmentation annotations from numbered text.
- Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
- Returns: Tuple containing:
- List[SegmentationAnnotation]: List of segmentation annotations
- str: The concatenated base text
- Example:
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
numbered_text = {
    "1": "First segment",
    "2": "Second segment"
}
anns, base = parser.extract_segmentation_anns(numbered_text)
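The relationship between the numbered text, the spans, and the concatenated base text can be sketched without the library. The helper `segment_numbered_text`, the plain-dict annotation shape, and the newline separator between segments are all assumptions made for illustration; the parser's real output uses SegmentationAnnotation objects and may join segments differently.

```python
from typing import Dict, List, Tuple


def segment_numbered_text(numbered_text: Dict[str, str]) -> Tuple[List[dict], str]:
    """Build character spans for each numbered segment and the joined base text."""
    anns: List[dict] = []
    pieces: List[str] = []
    offset = 0
    for number, text in numbered_text.items():
        anns.append({"index": int(number), "start": offset, "end": offset + len(text)})
        pieces.append(text)
        # Assumption: segments are separated by a single newline in the base text.
        offset += len(text) + 1
    return anns, "\n".join(pieces)


anns, base = segment_numbered_text({"1": "First segment", "2": "Second segment"})
for ann in anns:
    # Each span slices its own segment back out of the base text.
    print(ann["index"], repr(base[ann["start"]:ann["end"]]))
```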
Extracts alignment annotations from numbered text.
- Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
- Returns: Tuple containing:
- List[AlignmentAnnotation]: List of alignment annotations
- str: The concatenated base text
- Example:
from openpecha.pecha.parsers.docx.root import DocxRootParser

parser = DocxRootParser()
numbered_text = {
    "1": "First segment",
    "2": "Second segment"
}
anns, base = parser.extract_alignment_anns(numbered_text)
Parses a DOCX file and creates a commentary Pecha object with annotations.
- Parameters:
input(str | Path): Path to the DOCX file to be parsed
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
metadata(Dict[str, Any]): Dictionary containing metadata for the Pecha
output_path(Path, optional): Directory where the Pecha should be created. Defaults to PECHAS_PATH
pecha_id(str | None, optional): Custom Pecha ID. If not provided, a new ID will be generated
- Returns: Tuple containing:
- Pecha: The created Pecha instance
- annotation_path: Path to the created annotation layer file
- Example:
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
pecha, layer_path = parser.parse(
    input="path/to/commentary.docx",
    annotation_type=AnnotationType.ALIGNMENT,
    metadata={"title": "Commentary Title", "type": "commentary", "parent": "P0001"},
    output_path=Path("./output")
)
Extracts text and annotations from a commentary DOCX file.
- Parameters:
docx_file(Path): Path to the DOCX file
annotation_type(AnnotationType): Type of annotation to extract (SEGMENTATION or ALIGNMENT)
- Returns: Tuple containing:
- List[BaseAnnotation]: List of extracted annotations
- str: The extracted base text
- Example:
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
anns, base = parser.extract_anns(
    Path("path/to/commentary.docx"),
    AnnotationType.ALIGNMENT
)
Extracts segmentation annotations from numbered commentary text.
- Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
- Returns: Tuple containing:
- List[SegmentationAnnotation]: List of segmentation annotations
- str: The concatenated base text
- Example:
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
numbered_text = {
    "1": "First commentary segment",
    "2": "Second commentary segment"
}
anns, base = parser.extract_segmentation_anns(numbered_text)
Extracts alignment annotations from numbered commentary text, handling root text references.
- Parameters:
numbered_text(Dict[str, str]): Dictionary mapping segment numbers to text content
- Returns: Tuple containing:
- List[AlignmentAnnotation]: List of alignment annotations with root text references
- str: The concatenated base text
- Example:
from openpecha.pecha.parsers.docx.commentary.simple import DocxSimpleCommentaryParser

parser = DocxSimpleCommentaryParser()
numbered_text = {
    "1": "1-2 First commentary segment",
    "2": "3-4 Second commentary segment"
}
anns, base = parser.extract_alignment_anns(numbered_text)
- Note: The commentary text can include root text references in the format "1-2 Commentary text" where "1-2" refers to the root text segments being commented on.
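The "1-2 Commentary text" convention in the note above can be illustrated by separating the leading reference from the commentary text. This is a sketch only: `split_root_reference` and the regular expression are hypothetical, and the parser may accept other reference forms or handle edge cases differently.

```python
import re
from typing import Optional, Tuple

# Assumption: a reference is digits, optional ranges, and commas, followed by whitespace.
ROOT_REF = re.compile(r"^(\d+(?:-\d+)?(?:,\d+(?:-\d+)?)*)\s+(.*)$", re.DOTALL)


def split_root_reference(segment: str) -> Tuple[Optional[str], str]:
    """Return (root_reference, commentary_text); the reference is None when absent."""
    match = ROOT_REF.match(segment)
    if match is None:
        return None, segment
    return match.group(1), match.group(2)


print(split_root_reference("1-2 First commentary segment"))
print(split_root_reference("Commentary without a reference"))
```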
Adds annotations to an existing Pecha from a DOCX file.
- Parameters:
pecha(Pecha): The Pecha instance to add annotations to
type(AnnotationType | str): Type of annotation to extract (ALIGNMENT, SEGMENTATION, or FOOTNOTE)
docx_file(Path): Path to the DOCX file containing annotations
metadatas(List[Any]): List of metadata objects used to determine whether the Pecha is root-related
- Returns: Tuple containing:
- Pecha: The updated Pecha instance
- annotation_path: Path to the created annotation layer file
- Example:
from pathlib import Path
from openpecha.pecha.layer import AnnotationType
from openpecha.pecha.parsers.docx.annotation import DocxAnnotationParser

parser = DocxAnnotationParser()
pecha, layer_path = parser.add_annotation(
    pecha=existing_pecha,
    type=AnnotationType.FOOTNOTE,
    docx_file=Path("path/to/annotations.docx"),
    metadatas=[metadata]
)
- Note:
- The parser supports three types of annotations: ALIGNMENT, SEGMENTATION, and FOOTNOTE
- For FOOTNOTE annotations, it uses DocxFootnoteParser
- For root-related Pechas, it uses DocxRootParser
- For other cases, it uses DocxSimpleCommentaryParser
- The coordinates of annotations are automatically updated to match the base text
Extracts the layer name from a layer path.
- Parameters:
layer_path(str): Path to the layer file
- Returns: str containing the layer name (filename without extension)
- Example:
updater = DocxAnnotationUpdate()
layer_name = updater.extract_layer_name("path/to/segmentation-1234.json")
print(layer_name)  # "segmentation-1234"
Extracts the layer ID from a layer path.
- Parameters:
layer_path(str): Path to the layer file
- Returns: str containing the layer ID (last part of the filename after the hyphen)
- Example:
updater = DocxAnnotationUpdate()
layer_id = updater.extract_layer_id("path/to/segmentation-1234.json")
print(layer_id)  # "1234"
Extracts the annotation type from a layer path.
- Parameters:
layer_path(str): Path to the layer file
- Returns: AnnotationType enum value corresponding to the layer type
- Example:
updater = DocxAnnotationUpdate()
layer_type = updater.extract_layer_enum("path/to/segmentation-1234.json")
print(layer_type)  # AnnotationType.SEGMENTATION
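The three extractions above all derive from the layer file name. A standalone sketch using only pathlib is shown below; the function names are hypothetical, and the type is returned as a plain string rather than an AnnotationType member, since the enum's values are not listed in this reference.

```python
from pathlib import Path


def layer_name(layer_path: str) -> str:
    """File name without its extension, e.g. 'segmentation-1234'."""
    return Path(layer_path).stem


def layer_id(layer_path: str) -> str:
    """Part of the file name after the last hyphen, e.g. '1234'."""
    return Path(layer_path).stem.split("-")[-1]


def layer_type_name(layer_path: str) -> str:
    """Part of the file name before the last hyphen, e.g. 'segmentation'."""
    return Path(layer_path).stem.rsplit("-", 1)[0]


path = "path/to/segmentation-1234.json"
print(layer_name(path), layer_id(path), layer_type_name(path))
```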
Updates annotations in an existing Pecha from a DOCX file while preserving the layer ID.
- Parameters:
pecha(Pecha): The Pecha instance to update annotations in
annotation_path(str): Path to the existing annotation layer file
docx_file(Path): Path to the DOCX file containing new annotations
metadatas(List[Any]): List of metadata objects used to determine whether the Pecha is root-related
- Returns: Updated Pecha instance
- Example:
from pathlib import Path
from openpecha.pecha.parsers.docx.update import DocxAnnotationUpdate

updater = DocxAnnotationUpdate()
updated_pecha = updater.update_annotation(
    pecha=existing_pecha,
    annotation_path="path/to/segmentation-1234.json",
    docx_file=Path("path/to/updated_annotations.docx"),
    metadatas=[metadata]
)
- Note:
- The method preserves the original layer ID when updating annotations
- It automatically determines the annotation type from the existing layer path
- Uses DocxAnnotationParser internally to handle the actual annotation update
Checks if a text string is empty (contains only whitespace and newlines).
- Parameters:
text(str): The text to check
- Returns: bool indicating if the text is empty
- Example:
transfer = TranslationAlignmentTransfer()
is_empty = transfer.is_empty(" \n ")  # True
is_empty = transfer.is_empty("Some text")  # False
Gets the path to the first segmentation layer JSON file in a Pecha.
- Parameters:
pecha(Pecha): The Pecha instance to search in
- Returns: Path object pointing to the segmentation layer file
- Example:
transfer = TranslationAlignmentTransfer()
seg_path = transfer.get_segmentation_ann_path(pecha)
Maps annotations from source layer to target layer based on span overlap or containment.
- Parameters:
src_layer(AnnotationStore): Source annotation layer
tgt_layer(AnnotationStore): Target annotation layer
- Returns: Dictionary mapping source indices to lists of target indices
- Example:
transfer = TranslationAlignmentTransfer()
mapping = transfer.map_layer_to_layer(source_layer, target_layer)
- Note:
- Maps based on span overlap or containment
- Excludes edge overlaps
- Returns a sorted dictionary
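The overlap rule in the notes above can be sketched on plain (start, end) tuples. This is an illustration, not the library's code: `map_spans` is hypothetical, spans are assumed to be half-open, and "edge overlap" is interpreted as two spans that merely touch at a boundary.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]


def map_spans(src: Dict[int, Span], tgt: Dict[int, Span]) -> Dict[int, List[int]]:
    """Map each source index to the target indices whose spans overlap or contain it."""
    mapping: Dict[int, List[int]] = {}
    for src_idx, (s_start, s_end) in sorted(src.items()):
        mapping[src_idx] = [
            tgt_idx
            for tgt_idx, (t_start, t_end) in sorted(tgt.items())
            # Strict inequalities: spans that only touch at an edge are excluded.
            if s_start < t_end and t_start < s_end
        ]
    return mapping


source = {1: (0, 10), 2: (10, 20)}
target = {1: (0, 5), 2: (5, 15), 3: (15, 20)}
mapping = map_spans(source, target)
print(mapping)  # {1: [1, 2], 2: [2, 3]}
```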
Gets mapping from a Pecha's alignment layer to its segmentation layer.
- Parameters:
pecha(Pecha): The Pecha instance
alignment_id(str): ID of the alignment layer
- Returns: Dictionary mapping alignment indices to segmentation indices
- Example:
transfer = TranslationAlignmentTransfer()
mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
Gets mapping from segmentation to alignment layer in a translation Pecha.
- Parameters:
pecha(Pecha): The translation Pecha instance
alignment_id(str): ID of the alignment layer
segmentation_id(str): ID of the segmentation layer
- Returns: Dictionary mapping segmentation indices to alignment indices
- Example:
transfer = TranslationAlignmentTransfer()
mapping = transfer.get_translation_pechas_mapping(
    pecha,
    "alignment-1234.json",
    "segmentation-5678.json"
)
Flattens a mapping from translation to root text into a list of texts.
- Parameters:
mapping(Dict[int, List[str]]): Mapping of indices to text lists
- Returns: List of texts, with empty strings for missing indices
- Example:
transfer = TranslationAlignmentTransfer()
texts = transfer.mapping_to_text_list({1: ["text1"], 3: ["text2"]})
# ["text1", "", "text2"]
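The flattening behaviour shown in the example can be reproduced with a few lines of plain Python. The helper `flatten_mapping` is hypothetical; it assumes indices start at 1 and that multiple texts under one index are joined with a newline, neither of which is confirmed by this reference.

```python
from typing import Dict, List


def flatten_mapping(mapping: Dict[int, List[str]]) -> List[str]:
    """Return one text per index from 1 to the largest key, with '' for gaps."""
    if not mapping:
        return []
    # Assumption: indices are 1-based and texts sharing an index are newline-joined.
    return ["\n".join(mapping.get(idx, [])) for idx in range(1, max(mapping) + 1)]


texts = flatten_mapping({1: ["text1"], 3: ["text2"]})
print(texts)  # ['text1', '', 'text2']
```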
Serializes root translation alignment text mapped to root segmentation text.
- Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
root_translation_pecha(Pecha): The translation Pecha instance
translation_alignment_id(str): ID of the translation alignment layer
- Returns: List of texts aligned with root segmentation
- Example:
transfer = TranslationAlignmentTransfer()
texts = transfer.get_serialized_translation_alignment(
    root_pecha,
    "alignment-1234.json",
    translation_pecha,
    "alignment-5678.json"
)
Serializes root translation segmentation text mapped to root segmentation text.
- Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
translation_pecha(Pecha): The translation Pecha instance
translation_alignment_id(str): ID of the translation alignment layer
translation_segmentation_id(str): ID of the translation segmentation layer
- Returns: List of texts aligned with root segmentation
- Example:
transfer = TranslationAlignmentTransfer()
texts = transfer.get_serialized_translation_segmentation(
    root_pecha,
    "alignment-1234.json",
    translation_pecha,
    "alignment-5678.json",
    "segmentation-9012.json"
)
Checks if an annotation is valid (exists and has non-empty text).
- Parameters:
anns(Dict[int, Dict[str, Any]]): Dictionary of annotations
idx(int): Index to check
- Returns: bool indicating if the annotation is valid
- Example:
transfer = CommentaryAlignmentTransfer()
is_valid = transfer.is_valid_ann(annotations, 1)
Gets the path to the first segmentation layer JSON file in a Pecha.
- Parameters:
pecha(Pecha): The Pecha instance to search in
- Returns: Path object pointing to the segmentation layer file
- Example:
transfer = CommentaryAlignmentTransfer()
seg_path = transfer.get_segmentation_ann_path(pecha)
Indexes annotations by their root index.
- Parameters:
anns(List[Dict[str, Any]]): List of annotation dictionaries
- Returns: Dictionary mapping root indices to annotation dictionaries
- Example:
transfer = CommentaryAlignmentTransfer()
indexed_anns = transfer.index_annotations_by_root(annotations)
Maps annotations from source layer to target layer based on span overlap or containment.
- Parameters:
src_layer(AnnotationStore): Source annotation layer
tgt_layer(AnnotationStore): Target annotation layer
- Returns: Dictionary mapping source indices to lists of target indices
- Example:
transfer = CommentaryAlignmentTransfer()
mapping = transfer.map_layer_to_layer(source_layer, target_layer)
- Note:
- Maps based on span overlap or containment
- Excludes edge overlaps
- Returns a sorted dictionary
- Handles complex alignment indices (e.g., "1,2-4")
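A complex alignment index such as "1,2-4" can be expanded into individual indices as sketched below. The helper `parse_alignment_index` is hypothetical, and treating ranges as inclusive is an assumption; the library's own parsing is not shown in this reference.

```python
from typing import List


def parse_alignment_index(index: str) -> List[int]:
    """Expand a string like '1,2-4' into a list of integer indices."""
    result: List[int] = []
    for part in index.split(","):
        part = part.strip()
        if "-" in part:
            start, end = part.split("-")
            # Assumption: ranges are inclusive on both ends.
            result.extend(range(int(start), int(end) + 1))
        else:
            result.append(int(part))
    return result


print(parse_alignment_index("1,2-4"))  # [1, 2, 3, 4]
```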
Gets mapping from a Pecha's alignment layer to its segmentation layer.
- Parameters:
pecha(Pecha): The Pecha instance
alignment_id(str): ID of the alignment layer
- Returns: Dictionary mapping alignment indices to segmentation indices
- Example:
transfer = CommentaryAlignmentTransfer()
mapping = transfer.get_root_pechas_mapping(pecha, "alignment-1234.json")
Gets mapping from commentary Pecha's segmentation layer to alignment layer.
- Parameters:
pecha(Pecha): The commentary Pecha instance
alignment_id(str): ID of the alignment layer
segmentation_id(str): ID of the segmentation layer
- Returns: Dictionary mapping segmentation indices to alignment indices
- Example:
transfer = CommentaryAlignmentTransfer()
mapping = transfer.get_commentary_pechas_mapping(
    pecha,
    "alignment-1234.json",
    "segmentation-5678.json"
)
Serializes commentary annotations with root/segmentation mapping and formatting.
- Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
commentary_pecha(Pecha): The commentary Pecha instance
commentary_alignment_id(str): ID of the commentary alignment layer
- Returns: List of formatted commentary texts
- Example:
transfer = CommentaryAlignmentTransfer()
texts = transfer.get_serialized_commentary(
    root_pecha,
    "alignment-1234.json",
    commentary_pecha,
    "alignment-5678.json"
)
Serializes commentary segmentation annotations with root/segmentation mapping and formatting.
- Parameters:
root_pecha(Pecha): The root Pecha instance
root_alignment_id(str): ID of the root alignment layer
commentary_pecha(Pecha): The commentary Pecha instance
commentary_alignment_id(str): ID of the commentary alignment layer
commentary_segmentation_id(str): ID of the commentary segmentation layer
- Returns: List of formatted commentary texts
- Example:
transfer = CommentaryAlignmentTransfer()
texts = transfer.get_serialized_commentary_segmentation(
    root_pecha,
    "alignment-1234.json",
    commentary_pecha,
    "alignment-5678.json",
    "segmentation-9012.json"
)
Formats a commentary text with chapter and segment information.
- Parameters:
chapter_num(int): Chapter number
seg_idx(int): Segment index
text(str): Commentary text
- Returns: Formatted string in the format "<chapter_num><seg_idx>text"
- Example:
transfer = CommentaryAlignmentTransfer()
formatted = transfer.format_serialized_commentary(1, 2, "Commentary text")
# "<1><2>Commentary text"
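The output shape in the example above amounts to wrapping the chapter number and segment index in angle brackets ahead of the text. A standalone sketch, with the hypothetical helper `format_commentary` standing in for the method, looks like this:

```python
def format_commentary(chapter_num: int, seg_idx: int, text: str) -> str:
    """Prefix the commentary text with '<chapter_num><seg_idx>' (illustrative only)."""
    return f"<{chapter_num}><{seg_idx}>{text}"


formatted = format_commentary(1, 2, "Commentary text")
print(formatted)  # <1><2>Commentary text
```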
Processes a single commentary annotation and returns the serialized string.
- Parameters:
ann(dict): The commentary annotation to process
root_anns(dict): Dictionary of root annotations
root_map(dict): Mapping from root alignment to segmentation
root_segmentation_anns(dict): Dictionary of root segmentation annotations
- Returns: Formatted commentary string or None if not valid
- Example:
transfer = CommentaryAlignmentTransfer()
result = transfer.process_commentary_ann(
    commentary_ann,
    root_anns,
    root_map,
    root_segmentation_anns
)