jamorasep.main

Attributes

parse

Classes

Kanamap

Morasep

Module Contents

class jamorasep.main.Kanamap(kanamap_csv: str | None = None)
kanamap
load_kanamap(kanamap_csv: str | None = None) -> Dict[str, Dict[str, str]]
__call__(kana: str) -> Dict[str, str]
get_2letter_morae() -> List[str]
lst_katakana() -> List[str]
header() -> List[str]
class jamorasep.main.Morasep(kanamap_csv: str | None = None)
kanamap
two_letter_morae
subscript = ['ァ', 'ィ', 'ゥ', 'ェ', 'ォ', 'ャ', 'ュ', 'ョ', 'ヮ', 'ぁ', 'ぃ', 'ぅ', 'ぇ', 'ぉ', 'ゃ', 'ゅ', 'ょ', 'ゎ']
upper
check_if_successive_2chars_compose_mora(i: str, j: str) -> List[str]

Check if the successive 2 characters compose a mora.

If so, return the mora. If not, return the list of morae depending on the relationship between the 2 characters.

Rules:

  • RULE 0: i + j forms a known two-letter mora (e.g., “キャ”, “シュ”)

  • RULE 1: i is normal kana + j is subscript but not a valid mora -> convert subscript to normal (e.g., “カァ” -> [“カ”, “ア”])

  • RULE 2: i is subscript + j is normal kana -> return [] (handled by previous pair)

  • RULE 3: i is subscript + j is subscript -> convert j to normal (e.g., “ァァ” -> [“ア”])

  • RULE 4: Otherwise -> return [i]
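The rules above can be sketched as a standalone function. This is only an illustration under stated assumptions, not the library's implementation: the names compose_pair, TWO_LETTER_MORAE, and TO_UPPER are hypothetical, the two-letter mora set is a tiny hand-picked subset (the library derives its set from kanamap.csv), and only katakana small kana are covered.

```python
from typing import List

# Hypothetical subset of two-letter morae, for illustration only.
TWO_LETTER_MORAE = {"キャ", "キュ", "キョ", "シャ", "シュ", "ショ"}

# Small (subscript) katakana mapped to their full-size counterparts.
TO_UPPER = dict(zip("ァィゥェォャュョヮ", "アイウエオヤユヨワ"))


def compose_pair(i: str, j: str) -> List[str]:
    """Apply RULE 0 to RULE 4 to two successive characters."""
    if i + j in TWO_LETTER_MORAE:   # RULE 0: known two-letter mora
        return [i + j]
    i_sub = i in TO_UPPER
    j_sub = j in TO_UPPER
    if not i_sub and j_sub:         # RULE 1: enlarge the stray small kana
        return [i, TO_UPPER[j]]
    if i_sub and not j_sub:         # RULE 2: already handled by previous pair
        return []
    if i_sub and j_sub:             # RULE 3: enlarge j only
        return [TO_UPPER[j]]
    return [i]                      # RULE 4: plain single character


print(compose_pair("キ", "ャ"))  # ['キャ']
print(compose_pair("カ", "ァ"))  # ['カ', 'ア']
print(compose_pair("ァ", "ァ"))  # ['ア']
```

In this sketch any character that is not a small kana counts as "normal" for RULE 1, which is a simplification; how symbols are treated in the actual method is not stated here.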

kana2mora(txt: str) -> List[str]

Convert a string of Japanese text (hiragana or katakana) into a list of morae.

Symbols and characters other than hiragana/katakana are separated character-wise and returned without modification.

Example

“あいうえお・きゃきゅきょ” -> [“あ”, “い”, “う”, “え”, “お”, “・”, “きゃ”, “きゅ”, “きょ”]

Parameters:

txt – A string of Japanese text (hiragana or katakana).

Returns:

A list of morae.
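To make the splitting behaviour concrete, the following sketch reproduces the example above with a greedy two-character lookup. It is a simplified stand-in, not the method's actual pairwise rule logic: greedy_split and TWO_LETTER_MORAE are hypothetical names, the mora set is a small illustrative subset, and stray small kana are not normalized.

```python
from typing import List

# Hypothetical subset of two-letter morae (hiragana), for illustration only.
TWO_LETTER_MORAE = {"きゃ", "きゅ", "きょ", "しゃ", "しゅ", "しょ"}


def greedy_split(txt: str) -> List[str]:
    """Split text into morae, preferring a known two-letter mora at each step."""
    morae: List[str] = []
    pos = 0
    while pos < len(txt):
        pair = txt[pos:pos + 2]
        if pair in TWO_LETTER_MORAE:
            morae.append(pair)      # consume both characters as one mora
            pos += 2
        else:
            morae.append(txt[pos])  # kana or symbol, passed through unchanged
            pos += 1
    return morae


print(greedy_split("あいうえお・きゃきゅきょ"))
# ['あ', 'い', 'う', 'え', 'お', '・', 'きゃ', 'きゅ', 'きょ']
```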

modify_special_mora(morae: List[str]) -> List[str]

Modify Q (ッ) in a romanized mora list.

  • If Q is the last mora, replace it with a space.

  • If the next mora starts with a consonant, replace Q with that consonant.

  • If the next mora starts with a vowel or symbol, replace Q with a space.

Parameters:

morae – A list of morae.

Returns:

A list of morae with Q replaced.
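The three cases can be sketched as follows. This is a minimal illustration, assuming the geminate marker appears in the romanized list as the literal string "Q" and that "consonant" means any ASCII letter other than a, i, u, e, o; resolve_sokuon is a hypothetical name, not part of the library.

```python
from typing import List

VOWELS = set("aiueo")


def resolve_sokuon(morae: List[str]) -> List[str]:
    """Replace each "Q" according to the mora that follows it."""
    out: List[str] = []
    for idx, mora in enumerate(morae):
        if mora != "Q":
            out.append(mora)
            continue
        # Empty string when Q is the last mora.
        nxt = morae[idx + 1] if idx + 1 < len(morae) else ""
        first = nxt[:1]
        is_consonant = (
            first.isascii() and first.isalpha() and first.lower() not in VOWELS
        )
        # Copy the following consonant; otherwise fall back to a space.
        out.append(first if is_consonant else " ")
    return out


print(resolve_sokuon(["ga", "Q", "ko", "o"]))  # ['ga', 'k', 'ko', 'o']
print(resolve_sokuon(["a", "Q"]))              # ['a', ' ']
```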

_convert_to_hiragana(lst: List[str]) -> List[str]
_convert_to_katakana(lst: List[str]) -> List[str]
_convert_to_other_format(lst: List[str], output_format: str) -> List[str]
convert_lst_of_mora(lst: List[str], output_format: str = 'katakana', phoneme: bool = False) -> List[str]

Convert a list of morae into katakana, hiragana, or other formats.

Parameters:
  • lst – A list of morae.

  • output_format – The output format. Options are “katakana”, “hiragana”, and any column in kanamap.csv (e.g., “kunrei”, “hepburn”, “simple-ipa”).

  • phoneme – If True, split the output into individual phonemes. Only effective when output_format is a romanization format.

Returns:

A list of morae in the specified format.
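For the “katakana” and “hiragana” options, one common way to convert between the two scripts is to shift code points by 0x60, since the Unicode hiragana block (U+3041 to U+3096) and the corresponding katakana range (U+30A1 to U+30F6) are laid out in parallel. The sketch below shows that technique only; whether the library converts this way is an assumption, and to_katakana and to_hiragana are hypothetical names. Romanized formats would instead need a lookup in the kanamap table, which is not shown.

```python
from typing import List

KANA_OFFSET = 0x60  # distance between parallel hiragana and katakana code points


def to_katakana(morae: List[str]) -> List[str]:
    """Shift hiragana characters to katakana; leave everything else unchanged."""
    return [
        "".join(chr(ord(c) + KANA_OFFSET) if "ぁ" <= c <= "ゖ" else c for c in m)
        for m in morae
    ]


def to_hiragana(morae: List[str]) -> List[str]:
    """Shift katakana characters to hiragana; leave everything else unchanged."""
    return [
        "".join(chr(ord(c) - KANA_OFFSET) if "ァ" <= c <= "ヶ" else c for c in m)
        for m in morae
    ]


print(to_katakana(["きゃ", "あ", "・"]))  # ['キャ', 'ア', '・']
print(to_hiragana(["キャ", "ー"]))        # ['きゃ', 'ー']
```

The long-vowel mark “ー” and symbols such as “・” fall outside both ranges, so they pass through unmodified.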

parse(txt: str = '', output_format: str | None = None, phoneme: bool = False, **kwargs) -> List[str]

Convert a kana string into a list of morae.

Parameters:
  • txt – A string of katakana or hiragana.

  • output_format – The output format. If None, return morae as-is. Options are “katakana”, “hiragana”, and any column in kanamap.csv (e.g., “kunrei”, “hepburn”, “simple-ipa”).

  • phoneme – If True, split the output into individual phonemes.

Returns:

A list of morae.

jamorasep.main.parse