名鉄線英語駅名修正 (Fix Meitetsu Line English station names) #1420

Merged
TinyKitten merged 17 commits into dev from data/meitetsu-caps
Mar 10, 2026

Conversation

@TinyKitten
Member

Summary

  • replace all-uppercase English station names on the Meitetsu Nagoya Main Line in data/3!stations.csv
  • normalize obvious spelling issues while converting to formal casing
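
The conversion described above is data work done by hand in the CSV, but the intent can be sketched as a small script. This is a hypothetical helper (`to_formal_casing` is not part of this PR); the lowercase-after-hyphen convention is taken from the station names discussed later in this thread (e.g. Jingu-mae, Okazakikoen-mae):

```python
def to_formal_casing(name: str) -> str:
    """Hypothetical helper: convert an all-uppercase romanized name
    (e.g. 'JINGU-MAE') to the casing used in this dataset ('Jingu-mae'):
    capitalize only the first letter of each space-separated word and
    keep letters after a hyphen lowercase. Mixed-case input is
    returned unchanged."""
    if not name.isupper():
        return name
    return " ".join(w[:1].upper() + w[1:] for w in name.lower().split(" "))

print(to_formal_casing("CHUKYO-KEIBAJO-MAE"))  # -> Chukyo-keibajo-mae
```

Note that spelling fixes (the second bullet) still require manual review; casing alone cannot catch them.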

Verification

  • confirmed all 60 line_cd=30001 rows no longer contain all-uppercase values in station_name_r / station_name_rn
  • tests not run (data-only CSV update)
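
The verification claim above can be reproduced with a short check. This is a sketch, assuming the column names `line_cd`, `station_name_r`, and `station_name_rn` used elsewhere in this thread; the function and sample data are illustrative only:

```python
import csv
from io import StringIO

NAME_COLS = ("station_name_r", "station_name_rn")

def all_upper_leftovers(csv_text: str, line_cd: str = "30001"):
    """Return (row_count, offending_values) for rows on the given line:
    values that are non-empty and still entirely uppercase."""
    rows = [r for r in csv.DictReader(StringIO(csv_text))
            if r.get("line_cd") == line_cd]
    bad = [r[c] for r in rows for c in NAME_COLS
           if r.get(c, "").strip() and r[c].isupper()]
    return len(rows), bad

# Against the real file, pass open("data/3!stations.csv", encoding="utf-8").read().
sample = "line_cd,station_name_r,station_name_rn\n30001,Jingu-mae,Jingu-mae\n30001,SAKAE,SAKAE\n"
print(all_upper_leftovers(sample))  # -> (2, ['SAKAE', 'SAKAE'])
```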

@TinyKitten TinyKitten requested a review from 10mocy as a code owner March 10, 2026 08:50
@github-actions github-actions bot added the data label Mar 10, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 10, 2026

Important

Review skipped

Review was skipped due to path filters

⛔ Files ignored due to path filters (2)
  • data/2!lines.csv is excluded by !**/*.csv
  • data/3!stations.csv is excluded by !**/*.csv

CodeRabbit blocks several paths by default. You can override this by explicitly including those paths in the path filters; for example, including **/dist/** overrides the default block on the dist directory by removing the pattern from both lists.
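
An override along those lines might look like the following `.coderabbit.yaml` fragment. This is an assumption based on the key names mentioned in this comment, not a verified configuration; check CodeRabbit's configuration reference before relying on it:

```yaml
# Hypothetical .coderabbit.yaml — re-include CSV files that the
# default path filters exclude, so data PRs like this one get reviewed.
reviews:
  path_filters:
    - "**/*.csv"
```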

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2b842263-055a-4ee9-9356-cd7b3891aed3


@TinyKitten TinyKitten changed the title from "Fix Meitetsu Nagoya Main Line station name casing" to "名鉄線英語駅名修正" Mar 10, 2026
@TinyKitten
Member Author

@coderabbitai Can you check the CSV for corruption and verify the English station names?

@coderabbitai
Contributor

coderabbitai bot commented Mar 10, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

Of course! I'll check the CSV integrity and the English station names.


🏁 Script executed:

#!/bin/bash
# Find the stations CSV file
fd -t f "stations" --extension csv | head -20

Length of output: 19


🏁 Script executed:

#!/bin/bash
# Also check for any CSV files in data directory
fd -t f "\.csv$" | head -30

Length of output: 173


🏁 Script executed:

#!/bin/bash
echo "=== Header check ==="
head -1 data/3!stations.csv

echo ""
echo "=== Total line count ==="
wc -l data/3!stations.csv

echo ""
echo "=== Column-count consistency check (rows that differ from the header) ==="
python3 - <<'EOF'
import csv

header_cols = None
inconsistent_rows = []
with open("data/3!stations.csv", encoding="utf-8") as f:
    reader = csv.reader(f)
    for i, row in enumerate(reader):
        if i == 0:
            header_cols = len(row)
            print(f"Header column count: {header_cols}")
            print(f"Column names: {row}")
            continue
        if len(row) != header_cols:
            inconsistent_rows.append((i + 1, len(row), row))

if inconsistent_rows:
    print("\nRows with inconsistent column counts:")
    for lineno, cols, row in inconsistent_rows:
        print(f"  line {lineno}: {cols} columns -> {row}")
else:
    print("\nColumn-count inconsistencies: none ✅")
EOF

echo ""
echo "=== NULL byte / control character check ==="
python3 -c "
data = open('data/3!stations.csv', 'rb').read()
null_count = data.count(b'\x00')
print(f'NULL bytes: {null_count}')
ctrl_chars = [b for b in data if b < 0x09 or (0x0e <= b <= 0x1f)]
print(f'Invalid control characters: {len(ctrl_chars)}')
if null_count == 0 and len(ctrl_chars) == 0:
    print('Encoding anomalies: none ✅')
"

Length of output: 787


🏁 Script executed:

#!/bin/bash
echo "=== Row count for line_cd=30001 ==="
python3 - <<'EOF'
import csv

rows_30001 = []
with open("data/3!stations.csv", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        if row.get("line_cd") == "30001":
            rows_30001.append(row)

print(f"Rows with line_cd=30001: {len(rows_30001)}")

print("\n=== English station names (station_name_r / station_name_rn) ===")
print(f"{'station_cd':<12} {'station_name':<20} {'station_name_r':<35} {'station_name_rn':<35}")
print("-" * 105)
for r in rows_30001:
    print(f"{r.get('station_cd',''):<12} {r.get('station_name',''):<20} {r.get('station_name_r',''):<35} {r.get('station_name_rn',''):<35}")

print("\n=== All-uppercase check (station_name_r) ===")
all_upper_r = [r for r in rows_30001 if r.get("station_name_r","").isupper() and r.get("station_name_r","").strip()]
if all_upper_r:
    print(f"⚠️ Rows still all-uppercase: {len(all_upper_r)}")
    for r in all_upper_r:
        print(f"  {r.get('station_name','')} -> {r.get('station_name_r','')}")
else:
    print("Remaining all-uppercase values: none ✅")

print("\n=== All-uppercase check (station_name_rn) ===")
all_upper_rn = [r for r in rows_30001 if r.get("station_name_rn","").isupper() and r.get("station_name_rn","").strip()]
if all_upper_rn:
    print(f"⚠️ Rows still all-uppercase: {len(all_upper_rn)}")
    for r in all_upper_rn:
        print(f"  {r.get('station_name','')} -> {r.get('station_name_rn','')}")
else:
    print("Remaining all-uppercase values: none ✅")

print("\n=== Empty-value check (station_name_r / station_name_rn) ===")
empty_r  = [r for r in rows_30001 if not r.get("station_name_r","").strip()]
empty_rn = [r for r in rows_30001 if not r.get("station_name_rn","").strip()]
print(f"Empty station_name_r:  {len(empty_r)}")
print(f"Empty station_name_rn: {len(empty_rn)}")
for r in empty_r:
    print(f"  [empty station_name_r ] station_cd={r.get('station_cd','')} station_name={r.get('station_name','')}")
for r in empty_rn:
    print(f"  [empty station_name_rn] station_cd={r.get('station_cd','')} station_name={r.get('station_name','')}")
EOF

Length of output: 6864


🏁 Script executed:

#!/bin/bash
echo "=== Spelling pattern check: uppercase letters past word-initial position (CamelCase leakage) ==="
python3 - <<'EOF'
import csv, re
from collections import Counter

rows_30001 = []
with open("data/3!stations.csv", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        if row.get("line_cd") == "30001":
            rows_30001.append(row)

def has_suspect_casing(name):
    # Any uppercase letter after the first character of a word
    # (CamelCase leakage or leftover all-uppercase)?
    words = name.split()
    for w in words:
        # strip non-letter characters
        w_clean = re.sub(r"[^A-Za-z]", "", w)
        if not w_clean:
            continue
        if w_clean[1:] != w_clean[1:].lower():
            return True
    return False

suspect = [(r.get("station_name",""), r.get("station_name_r",""), r.get("station_name_rn",""))
           for r in rows_30001
           if has_suspect_casing(r.get("station_name_r","")) or has_suspect_casing(r.get("station_name_rn",""))]

if suspect:
    print("⚠️ Suspect casing:")
    for name, r, rn in suspect:
        print(f"  {name}: r='{r}' rn='{rn}'")
else:
    print("Suspect casing: none ✅")

print("\n=== Duplicate station_cd check ===")
all_rows = []
with open("data/3!stations.csv", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for row in reader:
        all_rows.append(row)

cd_counts = Counter(r.get("station_cd","") for r in all_rows)
dups = [(cd, cnt) for cd, cnt in cd_counts.items() if cnt > 1]
if dups:
    print(f"⚠️ Duplicate station_cd values: {dups}")
else:
    print("Duplicate station_cd values: none ✅")
EOF

Length of output: 159


@TinyKitten Check complete! Here is a summary of the results.


✅ CSV integrity check

  • Header column count: 23 (consistent across all rows)
  • Total rows: 11,142 (excluding the header)
  • Column-count inconsistencies: none ✅
  • NULL bytes / invalid control characters: none ✅
  • Duplicate station_cd: none ✅

✅ English station name check (line_cd=30001)

  • Rows covered: 60 (matches the PR description) ✅
  • Remaining all-uppercase values (station_name_r): none ✅
  • Remaining all-uppercase values (station_name_rn): none ✅
  • Empty values: none ✅

⚠️ Needs confirmation: casing inconsistency in 島氏永

There is one spot worth a closer look.

島氏永: Shima-Ujinaga

The U after the hyphen is uppercase, while every other hyphenated station name on this line keeps the letter after the hyphen lowercase:

  • 岡崎公園前: Okazakikoen-mae
  • 二ツ杁: Futatsu-iri
  • 新川橋: Shinkawa-bashi
  • 木曽川堤: Kisogawa-zutsumi
  • 中京競馬場前: Chukyo-keibajo-mae
  • 神宮前: Jingu-mae
  • 島氏永: Shima-Ujinaga ⚠️ uppercase after the hyphen

Unifying this as Shima-ujinaga would likely be better. It would also be worth confirming against Meitetsu's official romanization.
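
A minimal sketch of how such an outlier could be caught mechanically. The function name and sample list are illustrative only; the rule (no uppercase letter directly after a hyphen) is taken from the convention observed above:

```python
import re

HYPHEN_UPPER = re.compile(r"-[A-Z]")

def hyphen_casing_outliers(names):
    """Flag romanized names with an uppercase letter right after a hyphen,
    which breaks the lowercase-after-hyphen convention used on this line."""
    return [n for n in names if HYPHEN_UPPER.search(n)]

names = ["Okazakikoen-mae", "Futatsu-iri", "Shinkawa-bashi",
         "Kisogawa-zutsumi", "Chukyo-keibajo-mae", "Jingu-mae",
         "Shima-Ujinaga"]
print(hyphen_casing_outliers(names))  # -> ['Shima-Ujinaga']
```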

@TinyKitten TinyKitten merged commit 8efa9db into dev Mar 10, 2026
6 checks passed
@TinyKitten TinyKitten deleted the data/meitetsu-caps branch March 10, 2026 10:27
@TinyKitten TinyKitten self-assigned this Mar 10, 2026
