■The Narrow Paths and Animal Trails of Ruby for Windows (Windows版Rubyの細道・けもの道)


3. Applications

3-4. Splitting one file into multiple files

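This section covers the reverse of "3-3-2. Combining multiple files into one file": distributing the records of one input file across multiple output files. As an example, it splits the postal code data provided by Japan Post (until March 2003, the Postal Services Agency) into one output file per prefecture. The postal code data is available at the following site.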

http://www.post.japanpost.jp/zipcode/

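The first field of each input record is a 5-byte value that combines the prefecture code (2 bytes) and the municipality code (3 bytes) defined by JIS. The script therefore takes the first 2 bytes of the first field and uses that value to decide which output file the record is written to.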
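The split of the first field described above can be sketched as follows. This is only an illustration: the helper name `split_jis_code` is hypothetical and does not appear in the scripts on this page, and the record is a shortened version of the sample data.

```ruby
# Hypothetical helper: splits the 5-character code into its two parts.
def split_jis_code(code)
  # First 2 characters: prefecture code. Next 3 characters: municipality code.
  [code[0, 2], code[2, 3]]
end

# Shortened record in the same layout as the sample data
line = '01101,"060  ","0600000"'
first_field = line.split(",", 2).first   # everything before the first comma
ken, shi = split_jis_code(first_field)

puts "prefecture=#{ken} municipality=#{shi}"
```

The first field is never quoted in the sample data, so a plain split on the first comma is enough for this step.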

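As 【Script】No. 1 shows, in Ruby the name of a variable that holds an IO object cannot itself be supplied by another variable, but the file name passed to the open method can be. If a file listing the output file name for each prefecture is prepared in advance as one more input, the script itself stays short. 【Script】No. 2 shows what the script looks like when every open call is given a constant file name; comparing the two shows how much shorter the first version is. As the number of input and output files grows, specifying the file names through variables keeps the script compact.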
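A minimal sketch of the idea, assuming a two-line excerpt of the file name list held in a string and a temporary directory instead of the real working folder. The names `file_name_for` and `written` are illustrative and not part of the original scripts.

```ruby
require "tmpdir"

# Hypothetical two-line excerpt of the file name list (prefecture code,file name)
list = "01,01HOKKAIDO.CSV\n02,02AOMORI.CSV\n"

# Lookup table: prefecture code => output file name
file_name_for = {}
list.each_line do |row|
  code, name = row.chomp.split(",", 2)
  file_name_for[code] = name
end

written = nil
Dir.mktmpdir do |dir|                          # temporary directory, removed afterwards
  path = File.join(dir, file_name_for["02"])   # file name chosen at run time
  File.open(path, "w") { |out| out.print "02201,sample\n" }  # block form closes the file
  written = File.read(path)
end

puts file_name_for["02"]
puts written
```

Because the file name comes from the table, adding another output file means adding a line to the list rather than another open call.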

Note that the input data is assumed to be sorted by key. To sort the data with Ruby, see [2-4. Sort processing].
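The sort order matters because 【Script】No. 1 opens an output file in "w" mode each time the prefecture code changes, so a prefecture that reappears later in the input would have its file truncated. One way to check the assumption before splitting is sketched below; the method name `sorted_by_prefecture?` and the sample records are hypothetical.

```ruby
# Returns true when the 2-character prefecture codes never decrease
# from one line to the next.
def sorted_by_prefecture?(lines)
  codes = lines.map { |line| line[0, 2] }
  codes.each_cons(2).all? { |a, b| a <= b }
end

sorted   = ["01101,a", "01102,b", "02201,c"]
unsorted = ["01101,a", "02201,c", "01102,b"]

puts sorted_by_prefecture?(sorted)     # prints true
puts sorted_by_prefecture?(unsorted)   # prints false
```

Comparing the codes as strings works here because every code has the same width of 2 digits.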

【Input data (postal code data) sample】
01101,"060  ","0600000","ホッカイドウ","サッポロシチュウオウク","イカニケイサイガナイバアイ","北海道","札幌市中央区","以下に掲載がない場合",0,0,0,0,0,0   
01101,"064  ","0640941","ホッカイドウ","サッポロシチュウオウク","アサヒガオカ","北海道","札幌市中央区","旭ケ丘",0,0,1,0,0,0
   
【Input data (output file name for each prefecture)】
01,01HOKKAIDO.CSV
02,02AOMORI.CSV
03,03IWATE.CSV
04,04MIYAGI.CSV
05,05AKITA.CSV
06,06YAMAGATA.CSV
07,07FUKUSHIMA.CSV
08,08IBARAKI.CSV
09,09TOCHIGI.CSV
10,10GUMMA.CSV
11,11SAITAMA.CSV
12,12CHIBA.CSV
13,13TOKYO.CSV
14,14KANAGAWA.CSV
15,15NIIGATA.CSV
16,16TOYAMA.CSV
17,17ISHIKAWA.CSV
18,18FUKUI.CSV
19,19YAMANASHI.CSV
20,20NAGANO.CSV
21,21GIFU.CSV
22,22SHIZUOKA.CSV
23,23AICHI.CSV
24,24MIE.CSV
25,25SHIGA.CSV
26,26KYOTO.CSV
27,27OSAKA.CSV
28,28HYOGO.CSV
29,29NARA.CSV
30,30WAKAYAMA.CSV
31,31TOTTORI.CSV
32,32SHIMANE.CSV
33,33OKAYAMA.CSV
34,34HIROSHIMA.CSV
35,35YAMAGUCHI.CSV
36,36TOKUSHIMA.CSV
37,37KAGAWA.CSV
38,38EHIME.CSV
39,39KOCHI.CSV
40,40FUKUOKA.CSV
41,41SAGA.CSV
42,42NAGASAKI.CSV
43,43KUMAMOTO.CSV
44,44OITA.CSV
45,45MIYAZAKI.CSV
46,46KAGOSHIMA.CSV
47,47OKINAWA.CSV
    
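【Script】No. 1
# ken1.rb
# Purpose : split the postal code data into one file per prefecture
# Copyright (c) 2002-2015 Mitsuo Minagawa, All rights reserved.
# (minagawa@fb3.so-net.ne.jp)
# Usage : c:\>ruby ken1.rb
#
# Input file (postal code data)
in1_file    =   open("ken_all.csv","r")
# Input file (output file name for each prefecture)
in2_file    =   open("ken.txt","r")

in2_ctr =   0   # number of records read from the file name list
sv_ken  =   0   # prefecture code of the previous record
w_file_nm   =   Array.new() # output file names, indexed by prefecture code
out1_file   =   nil # output file that is currently open

# Read the list of output file names
while   (line2  =   in2_file.gets)
    line2.chomp!
    in2     =   line2.split(",",-1)
    in2_ctr +=  1
    w_file_nm[in2_ctr]  =   in2[1]
end
in2_file.close

# Main processing
while   (line1  =   in1_file.gets)
    line1.chomp!

# Split the CSV record into an array
    in1     =   (line1 + ',')
                .scan(/"([^"\\]*(?:\\.[^"\\]*)*)",|([^,]*),/)
                .collect{|x,y|  y || x.gsub(/\\(.)/, '\1')}
    w_ken   =   in1[0].slice(0,2).to_i

# When the prefecture code changes, close the previous output file and open the next one
    if      (w_ken      !=  sv_ken)
            out1_file.close if  out1_file
            w_file  =   w_file_nm[w_ken]
            out1_file   =   open(w_file,"w")
    end

# Save the current key
    sv_ken  =   w_ken
# Write the record to the file for its prefecture
    out1_file.print line1,"\n"
end

# Close the files
in1_file.close
out1_file.close if  out1_file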
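
The script above parses each record with a hand-written regular expression. As a rough cross-check, the sketch below compares that expression with `CSV.parse_line` from Ruby's standard csv library, which the original scripts do not use. The record is a shortened, hypothetical line in the same layout as the sample data.

```ruby
require "csv"

# Shortened, hypothetical record in the same layout as the sample data
line = '01101,"060  ","0600000","SAPPORO",0,0'

# Parsing as done in the script: a quoted field or a bare field,
# each followed by a comma (hence the comma appended to the line)
by_regex = (line + ',')
             .scan(/"([^"\\]*(?:\\.[^"\\]*)*)",|([^,]*),/)
             .collect { |x, y| y || x.gsub(/\\(.)/, '\1') }

# Parsing with the standard library
by_csv = CSV.parse_line(line)

puts by_regex == by_csv
puts by_csv.first[0, 2]
```

The two approaches differ on quotes inside a field: the regular expression expects backslash escapes, while the csv library expects doubled quotes. For records like the sample data, which contain neither, both give the same array.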
   
【Script】No. 2
# ken2.rb   
# Purpose : split the postal code data into one file per prefecture
# Copyright (c) 2002-2015 Mitsuo Minagawa, All rights reserved. 
# (minagawa@fb3.so-net.ne.jp)   
# Usage : c:\>ruby ken2.rb
#   
# Input file
in1_file    =   open("ken_all.csv","r") 
# Output files
out01_file  =   open("01HOKKAIDO.CSV","w")  
out02_file  =   open("02AOMORI.CSV","w")    
out03_file  =   open("03IWATE.CSV","w") 
out04_file  =   open("04MIYAGI.CSV","w")    
out05_file  =   open("05AKITA.CSV","w") 
out06_file  =   open("06YAMAGATA.CSV","w")  
out07_file  =   open("07FUKUSHIMA.CSV","w") 
out08_file  =   open("08IBARAKI.CSV","w")   
out09_file  =   open("09TOCHIGI.CSV","w")   
out10_file  =   open("10GUMMA.CSV","w") 
out11_file  =   open("11SAITAMA.CSV","w")   
out12_file  =   open("12CHIBA.CSV","w") 
out13_file  =   open("13TOKYO.CSV","w") 
out14_file  =   open("14KANAGAWA.CSV","w")  
out15_file  =   open("15NIIGATA.CSV","w")   
out16_file  =   open("16TOYAMA.CSV","w")    
out17_file  =   open("17ISHIKAWA.CSV","w")  
out18_file  =   open("18FUKUI.CSV","w") 
out19_file  =   open("19YAMANASHI.CSV","w") 
out20_file  =   open("20NAGANO.CSV","w")    
out21_file  =   open("21GIFU.CSV","w")  
out22_file  =   open("22SHIZUOKA.CSV","w")  
out23_file  =   open("23AICHI.CSV","w") 
out24_file  =   open("24MIE.CSV","w")   
out25_file  =   open("25SHIGA.CSV","w") 
out26_file  =   open("26KYOTO.CSV","w") 
out27_file  =   open("27OSAKA.CSV","w") 
out28_file  =   open("28HYOGO.CSV","w") 
out29_file  =   open("29NARA.CSV","w")  
out30_file  =   open("30WAKAYAMA.CSV","w")  
out31_file  =   open("31TOTTORI.CSV","w")   
out32_file  =   open("32SHIMANE.CSV","w")   
out33_file  =   open("33OKAYAMA.CSV","w")   
out34_file  =   open("34HIROSHIMA.CSV","w") 
out35_file  =   open("35YAMAGUCHI.CSV","w") 
out36_file  =   open("36TOKUSHIMA.CSV","w") 
out37_file  =   open("37KAGAWA.CSV","w")    
out38_file  =   open("38EHIME.CSV","w") 
out39_file  =   open("39KOCHI.CSV","w") 
out40_file  =   open("40FUKUOKA.CSV","w")   
out41_file  =   open("41SAGA.CSV","w")  
out42_file  =   open("42NAGASAKI.CSV","w")  
out43_file  =   open("43KUMAMOTO.CSV","w")  
out44_file  =   open("44OITA.CSV","w")  
out45_file  =   open("45MIYAZAKI.CSV","w")  
out46_file  =   open("46KAGOSHIMA.CSV","w") 
out47_file  =   open("47OKINAWA.CSV","w")   

# Main processing
while   (line1  =   in1_file.gets)      
    line1.chomp!        

# Split the CSV record into an array
    in1     =   (line1 + ',')   
                .scan(/"([^"\\]*(?:\\.[^"\\]*)*)",|([^,]*),/)   
                .collect{|x,y| y || x.gsub(/\\(.)/, '\1')}  
    w_ken   =   in1[0].slice(0,2)   

    case    w_ken       
        when    "01"        
            out01_file.print    line1,"\n"  
        when    "02"        
            out02_file.print    line1,"\n"  
        when    "03"        
            out03_file.print    line1,"\n"  
        when    "04"        
            out04_file.print    line1,"\n"  
        when    "05"        
            out05_file.print    line1,"\n"  
        when    "06"        
            out06_file.print    line1,"\n"  
        when    "07"        
            out07_file.print    line1,"\n"  
        when    "08"        
            out08_file.print    line1,"\n"  
        when    "09"        
            out09_file.print    line1,"\n"  
        when    "10"        
            out10_file.print    line1,"\n"  
        when    "11"        
            out11_file.print    line1,"\n"  
        when    "12"        
            out12_file.print    line1,"\n"  
        when    "13"        
            out13_file.print    line1,"\n"  
        when    "14"        
            out14_file.print    line1,"\n"  
        when    "15"        
            out15_file.print    line1,"\n"  
        when    "16"        
            out16_file.print    line1,"\n"  
        when    "17"        
            out17_file.print    line1,"\n"  
        when    "18"        
            out18_file.print    line1,"\n"  
        when    "19"        
            out19_file.print    line1,"\n"  
        when    "20"        
            out20_file.print    line1,"\n"  
        when    "21"        
            out21_file.print    line1,"\n"  
        when    "22"        
            out22_file.print    line1,"\n"  
        when    "23"        
            out23_file.print    line1,"\n"  
        when    "24"        
            out24_file.print    line1,"\n"  
        when    "25"        
            out25_file.print    line1,"\n"  
        when    "26"        
            out26_file.print    line1,"\n"  
        when    "27"        
            out27_file.print    line1,"\n"  
        when    "28"        
            out28_file.print    line1,"\n"  
        when    "29"        
            out29_file.print    line1,"\n"  
        when    "30"        
            out30_file.print    line1,"\n"  
        when    "31"        
            out31_file.print    line1,"\n"  
        when    "32"        
            out32_file.print    line1,"\n"  
        when    "33"        
            out33_file.print    line1,"\n"  
        when    "34"        
            out34_file.print    line1,"\n"  
        when    "35"        
            out35_file.print    line1,"\n"  
        when    "36"        
            out36_file.print    line1,"\n"  
        when    "37"        
            out37_file.print    line1,"\n"  
        when    "38"        
            out38_file.print    line1,"\n"  
        when    "39"        
            out39_file.print    line1,"\n"  
        when    "40"        
            out40_file.print    line1,"\n"  
        when    "41"        
            out41_file.print    line1,"\n"  
        when    "42"        
            out42_file.print    line1,"\n"  
        when    "43"        
            out43_file.print    line1,"\n"  
        when    "44"        
            out44_file.print    line1,"\n"  
        when    "45"        
            out45_file.print    line1,"\n"  
        when    "46"        
            out46_file.print    line1,"\n"  
        when    "47"        
            out47_file.print    line1,"\n"  
    end     
end     

# Close the files
in1_file.close  
out01_file.close    
out02_file.close    
out03_file.close    
out04_file.close    
out05_file.close    
out06_file.close    
out07_file.close    
out08_file.close    
out09_file.close    
out10_file.close    
out11_file.close    
out12_file.close    
out13_file.close    
out14_file.close    
out15_file.close    
out16_file.close    
out17_file.close    
out18_file.close    
out19_file.close    
out20_file.close    
out21_file.close    
out22_file.close    
out23_file.close    
out24_file.close    
out25_file.close    
out26_file.close    
out27_file.close    
out28_file.close    
out29_file.close    
out30_file.close    
out31_file.close    
out32_file.close    
out33_file.close    
out34_file.close    
out35_file.close    
out36_file.close    
out37_file.close    
out38_file.close    
out39_file.close    
out40_file.close    
out41_file.close    
out42_file.close    
out43_file.close    
out44_file.close    
out45_file.close    
out46_file.close    
out47_file.close    
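
The 47 output variables in the script above could also be held in a single Hash keyed by prefecture code. The following is a sketch under stated assumptions, not the author's method: the helper `split_by_prefecture`, the two-entry name table, and the sample records are all hypothetical, and a temporary directory stands in for the real working folder.

```ruby
require "tmpdir"

# Hypothetical helper: writes each line to the file registered for its
# prefecture code, opening every output file at most once.
def split_by_prefecture(lines, file_name_for, dir)
  outputs = {}                                   # prefecture code => open File
  lines.each do |line|
    code = line[0, 2]
    name = file_name_for.fetch(code)             # raises KeyError for an unknown code
    outputs[code] ||= File.open(File.join(dir, name), "w")
    outputs[code].puts line
  end
  outputs.keys
ensure
  outputs.each_value(&:close) if outputs         # close every file that was opened
end

names = { "01" => "01HOKKAIDO.CSV", "02" => "02AOMORI.CSV" }
lines = ["01101,a", "02201,b", "01102,c"]        # deliberately not sorted

result = {}
Dir.mktmpdir do |dir|
  split_by_prefecture(lines, names, dir)
  names.each_value { |n| result[n] = File.read(File.join(dir, n)) }
end
p result
```

Because each file stays open until the end, this variant does not depend on the input being sorted, at the cost of keeping up to 47 files open at once.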
   
【Sample scripts and data】

Script No. 1 is available here.

Script No. 2 is available here.

A sample of the input data (output file name for each prefecture) is available here.




Copyright (c) 2004-2015 Mitsuo Minagawa, All rights reserved.