TensorFlow 2.0 Getting-Started Example II (Data Preprocessing)

2022-02-24 04:35:42

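In this article I ran hands-on examples of FCN-32s, FCN-8s, and DeepLab v3+. The dataset for these examples is taken from KITTI, a dataset from an autonomous-driving competition. DeepLab v3+ can reach 97% accuracy. The article is aimed at beginners, so it does not describe the FCN architecture; you can look that up directly on Baidu.
The article uses the TensorFlow 2.0 framework with Keras integrated, which keeps model training concise and much less complicated than in TF 1.x. Compared with the other deep learning frameworks, I find it the one best suited to beginners.
The library functions and parameters used here are introduced along the way, with reference to the TensorFlow 2.0 API.
The code for this article is published on GitHub (https://github.com/fengshilin/tf2.0-FCN).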

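The article is organized as follows:

Downloading and inspecting the data
Data preprocessing (with a focus on label preprocessing)
Loading the model
Building the models (FCN and DeepLab)
Training and testing the model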

2. Data preprocessing (with a focus on label preprocessing)

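The input to a TensorFlow model has shape [batch, h, w, c]: the batch size, the image height, the image width, and the number of channels. The batch dimension (how many samples are trained on at once) is added because Python code runs much faster when loops are replaced by matrix (vectorized) operations. For details, see Andrew Ng's lesson on this.
In semantic segmentation, the training set contains images and labels. The images are generally 3-channel, and the labels come either as 3-channel images or as single-channel maps of class values.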
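To make the batch dimension concrete, here is a minimal NumPy sketch. The four images are hypothetical placeholders, not KITTI data, and the scaling by 255 is only there to show one vectorized operation applied to the whole batch.

```python
import numpy as np

# Four hypothetical RGB images, each of shape (h, w, c) = (160, 576, 3).
images = [np.full((160, 576, 3), i, dtype=np.float32) for i in range(4)]

# Stacking adds the leading batch axis: (batch, h, w, c).
batch = np.stack(images, axis=0)
print(batch.shape)  # (4, 160, 576, 3)

# One vectorized operation touches every pixel of every image at once,
# instead of looping over the images in Python.
scaled = batch / 255.0
print(scaled.shape)  # (4, 160, 576, 3)
```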

Label preprocessing

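The input images must be converted to float32 tensors. The labels must be converted to values 0, 1, 2 that say which class each pixel belongs to.
The output of a trained model is a probability for each class. With three classes, for example, an output of [0.1, 0.2, 0.7] means the pixel belongs to the first class with probability 0.1, to the second with probability 0.2, and to the third with probability 0.7. So that training can push the output toward the correct class, the label's pixel values have to be converted to the class values 0, 1, 2.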
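As a small illustration of how per-class probabilities relate to class values, the sketch below picks the most probable class for each pixel with NumPy. The probabilities are made up for the example.

```python
import numpy as np

# Hypothetical per-pixel class probabilities for a 1x2 image with 3 classes.
probs = np.array([[[0.1, 0.2, 0.7],
                   [0.6, 0.3, 0.1]]])

# The predicted class of each pixel is the index of its largest probability.
pred = np.argmax(probs, axis=-1)
print(pred)  # [[2 0]]
```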
Note: the label preprocessing described above depends on the choice of loss function. Two examples:

softmax_cross_entropy_with_logits: takes labels of shape [batch, h, w, n_class]. A pixel takes a value such as [0, 1, 0], meaning it belongs to the second class.
sparse_categorical_crossentropy: takes labels of shape [batch, h, w]. A pixel takes a value such as 0, meaning it belongs to the first class.
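The two label layouts carry the same information. The following NumPy sketch, with a made-up 2x2 label map, converts class indices to one-hot vectors and back; it only illustrates the shapes and does not call either loss function.

```python
import numpy as np

n_class = 3

# Sparse labels: shape (batch, h, w), one class index per pixel.
sparse = np.array([[[0, 1],
                    [2, 1]]])

# One-hot labels: shape (batch, h, w, n_class).
# Indexing the identity matrix with the class indices picks one row per pixel.
one_hot = np.eye(n_class, dtype=np.float32)[sparse]
print(one_hot.shape)     # (1, 2, 2, 3)
print(one_hot[0, 0, 1])  # [0. 1. 0.]

# argmax over the last axis recovers the sparse form.
recovered = np.argmax(one_hot, axis=-1)
```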

The KITTI data used here has two main classes, background and road. I use softmax_cross_entropy_with_logits as the loss function, so each label pixel must be converted to [1, 0] or [0, 1]: [1, 0] if the pixel is background, [0, 1] if it is road.
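To see why the [1, 0] / [0, 1] encoding fits this loss, here is a NumPy sketch of the softmax cross-entropy formula for two pixels. The function name softmax_cross_entropy and the logits are assumptions made for illustration; this is not TensorFlow's implementation.

```python
import numpy as np

def softmax_cross_entropy(labels, logits):
    """Per-pixel cross-entropy between one-hot labels and softmax(logits)."""
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Only the log-probability of the true class contributes to the loss.
    return -(labels * log_softmax).sum(axis=-1)

labels = np.array([[1.0, 0.0],   # pixel 0 is background
                   [0.0, 1.0]])  # pixel 1 is road
logits = np.array([[2.0, 0.0],   # the model favors background for both pixels
                   [2.0, 0.0]])

loss = softmax_cross_entropy(labels, logits)
print(loss)  # the road pixel, predicted as background, gets the larger loss
```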

Step 1

Build the lists of image paths and label paths.

import tensorflow as tf
import cv2
import os
import scipy
import numpy as np

train_dir = os.path.join("data", "train", "img") + "/"  # os.path.join joins the parts with the path separator: data/train/img/
train_label_dir = os.path.join("data", "train", "label") + "/"

train_list_dir = os.listdir(train_dir)  # image file names in the training image directory, e.g. ["um_000001.png", ...]
train_list_dir.sort()  # sort so that train_list_dir lines up one-to-one with train_label_list_dir

train_label_list_dir = os.listdir(train_label_dir)  # label file names in the training label directory, e.g. ["um_road_000001.png", ...]
train_label_list_dir.sort()

assert len(train_list_dir) == len(train_label_list_dir), "The number of training images and labels does not match"

train_filenames = [train_dir + filename for filename in train_list_dir]  # image paths, e.g. ["data/train/img/um_000001.png", ...]
train_label_filenames = [train_label_dir + filename
                         for filename in train_label_list_dir]  # label paths, e.g. ["data/train/label/um_road_000001.png", ...]
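Sorting both lists only pairs images with labels correctly if the file names follow a consistent pattern. As an optional sanity check, the sketch below verifies each pair by name. The helper check_pairs and the "_road" tag are assumptions based on the example file names above, not part of the original code.

```python
import os

def check_pairs(image_paths, label_paths, tag="_road"):
    """Return True if every label name equals its image name with `tag` inserted."""
    if len(image_paths) != len(label_paths):
        return False
    for image_path, label_path in zip(image_paths, label_paths):
        image_name = os.path.basename(image_path)
        label_name = os.path.basename(label_path)
        # "um_road_000001.png" with the tag removed should be "um_000001.png".
        if label_name.replace(tag, "", 1) != image_name:
            return False
    return True

print(check_pairs(["data/train/img/um_000001.png"],
                  ["data/train/label/um_road_000001.png"]))  # True
```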

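Step 2

Create the generators. TensorFlow uses them to build a dataset that can be fed directly to the model for training.
Keep the following in mind when preprocessing the data:
1. The images in the dataset vary in size, so they must be resized to a fixed size (160, 576). FCN downsamples the height and width to 1/32 of the original before upsampling, so both should be multiples of 32.
2. The image preprocessing is optional; you do not have to follow the code exactly.
3. For the label, first mark the pixels equal to the background color as True, then concatenate that mask with its inverse to get values like [True, False]. That is, a pixel whose value is the background color becomes [True, False], and any other pixel becomes [False, True]. If you are curious, print the values at each step yourself.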
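The label step in item 3 can be traced on a tiny example. The sketch below uses a made-up 1x2 label image; the magenta pixel standing in for road is an assumption about the label colors, while the red background color matches the code in this article.

```python
import numpy as np

background_color = np.array([255, 0, 0])

# Hypothetical 1x2 RGB label image: a background pixel, then a non-background pixel.
gt_image = np.array([[[255, 0, 0], [255, 0, 255]]], dtype=np.uint8)

# True where all three channels match the background color.
gt_bg = np.all(gt_image == background_color, axis=2)
print(gt_bg)  # [[ True False]]

# Add a channel axis, then append the inverted mask as a second channel.
gt_bg = gt_bg.reshape(*gt_bg.shape, 1)
label = np.concatenate((gt_bg, np.invert(gt_bg)), axis=2)
print(label.shape)  # (1, 2, 2)
print(label[0, 0], label[0, 1])  # [ True False] [False  True]
```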

def train_generator():
    """Training set generator"""
    # zip pairs the two path lists built earlier, giving [("data/train/img/um_000001.png", "data/train/label/um_road_000001.png"), ...]; iterate over the pairs and load each image and label in turn.
    for train_file_name, train_label_filename in zip(train_filenames, train_label_filenames):
        image, label = handle_data(train_file_name, train_label_filename)

        # On the first iteration the paths are "data/train/img/um_000001.png" and "data/train/label/um_road_000001.png"; later iterations move down the lists. yield hands back one pair at a time, which is how Python generators work.
        yield image, label


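def test_generator():
    """Test set generator"""
    # test_filenames is assumed to be a list of test image paths built the same way as train_filenames.
    for test_filename in test_filenames:
        image = handle_data(test_filename)

        yield image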


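image_shape = (160, 576)  # (height, width); both are multiples of 32


def handle_data(train_filenames, train_label_filenames=None):
    """Load and preprocess one image and, if a path is given, its label."""
    # scipy.misc.imread and scipy.misc.imresize have been removed from recent
    # SciPy releases, so OpenCV is used here instead. cv2.imread returns BGR
    # and cv2.resize expects (width, height).
    size = (image_shape[1], image_shape[0])
    image = cv2.cvtColor(cv2.imread(train_filenames), cv2.COLOR_BGR2RGB)
    image = cv2.resize(image, size)  # the images differ in size, so resize to the agreed (160, 576)

    # Equalize the luminance channel to reduce the effect of shadows.
    # This step is optional; leaving it out only makes the results a little worse.
    image_yuv = cv2.cvtColor(image, cv2.COLOR_RGB2YUV)
    image_yuv[:, :, 0] = cv2.equalizeHist(image_yuv[:, :, 0])
    image = cv2.cvtColor(image_yuv, cv2.COLOR_YUV2RGB)

    # Process the label.
    if train_label_filenames is not None:
        gt_image = cv2.cvtColor(cv2.imread(train_label_filenames), cv2.COLOR_BGR2RGB)
        # Nearest-neighbor interpolation keeps the label colors exact instead of blending them.
        gt_image = cv2.resize(gt_image, size, interpolation=cv2.INTER_NEAREST)

        background_color = np.array([255, 0, 0])
        gt_bg = np.all(gt_image == background_color, axis=2)  # True where the pixel is background
        gt_bg = gt_bg.reshape(*gt_bg.shape, 1)
        # Channel 0 marks background, channel 1 marks everything else (road).
        gt_image = np.concatenate((gt_bg, np.invert(gt_bg)), axis=2)

        return np.array(image), gt_image
    else:
        return np.array(image)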
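If you want a target size other than (160, 576), it should still be a multiple of 32 in both dimensions, as noted in item 1 above. The helper below is a hypothetical sketch for rounding a size up to the next multiple of 32; the 375 x 1242 input is only an example value.

```python
def round_up_to_multiple(size, base=32):
    """Round `size` up to the nearest multiple of `base`."""
    return ((size + base - 1) // base) * base

# The size used in this article is already a multiple of 32 in both dimensions.
print(round_up_to_multiple(160), round_up_to_multiple(576))   # 160 576

# An example 375 x 1242 image would be rounded up to 384 x 1248.
print(round_up_to_multiple(375), round_up_to_multiple(1242))  # 384 1248
```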

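Step 3

Build the dataset. Here I use the from_generator method of tf.data.Dataset; there are many other ways to create a dataset besides from_generator.
After the dataset is created, it can be transformed with map, shuffle, batch, and so on.
The arguments to from_generator are:

train_generator: the training-set generator defined above.
(tf.float32, tf.float32): the types of the two elements the generator yields, both float32. If the generator yields only one element, pass tf.float32 alone.
(tf.TensorShape([None, None, None]), tf.TensorShape([None, None, None])): the shapes of the two yielded elements, where None means unknown. Use None for any dimension you do not know, and pass a single shape if only one element is yielded.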

train_dataset = tf.data.Dataset.from_generator(
    train_generator, (tf.float32, tf.float32), (tf.TensorShape([None, None, None]), tf.TensorShape([None, None, None])))

train_dataset = train_dataset.shuffle(buffer_size=len(train_filenames))  # shuffle the data so it is not read in the sorted order from step 1
train_dataset = train_dataset.batch(batch_size)  # set how many samples go into each training batch; the model input has shape [batch, h, w, c]
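Newer TensorFlow releases (2.4 and later, as far as I know) also accept an output_signature argument in place of the separate type and shape tuples. The sketch below assumes such a version and uses a hypothetical toy_generator that yields zero-filled arrays instead of real KITTI data, so it can run without the dataset.

```python
import numpy as np
import tensorflow as tf

def toy_generator():
    """Yield two hypothetical (image, label) pairs as float32 arrays."""
    for _ in range(2):
        image = np.zeros((160, 576, 3), dtype=np.float32)
        label = np.zeros((160, 576, 2), dtype=np.float32)
        yield image, label

# output_signature describes the dtype and shape of each yielded element.
dataset = tf.data.Dataset.from_generator(
    toy_generator,
    output_signature=(
        tf.TensorSpec(shape=(None, None, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(None, None, 2), dtype=tf.float32),
    ),
)
dataset = dataset.shuffle(buffer_size=2).batch(2)

for images, labels in dataset:
    print(images.shape, labels.shape)  # (2, 160, 576, 3) (2, 160, 576, 2)
```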

That completes the data preprocessing. What remains is loading and fine-tuning the model, then saving and reloading it.
