failed to allocate 18.41M (19300352 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY

2022-02-12 18:35:53

TensorFlow training fails with "failed to allocate 18.41M (19300352 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY"

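I installed tensorflow-gpu version 1.6.0 on Ubuntu 16.04, and training in a Jupyter notebook failed with "failed to allocate 18.41M (19300352 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY". After some searching, it turned out that part of the GPU memory was still held by a program I had run earlier, so there was not enough free memory left to satisfy even an 18.41M allocation.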
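As a quick sanity check on the numbers in the message, the sketch below converts the byte count to mebibytes. It assumes the "M" in the log means MiB (2^20 bytes), which is what the arithmetic suggests; `bytes_to_mib` is a hypothetical helper name, not part of TensorFlow.

```python
def bytes_to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 2**20 bytes)."""
    return num_bytes / 2**20


requested = 19300352  # byte count taken from the error message
print(f"{bytes_to_mib(requested):.2f} MiB")  # prints "18.41 MiB"
```

So the failed request itself is small; the problem is how little memory is left on the device, not the size of the allocation.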

Normally we check GPU usage with nvidia-smi and then run sudo kill with the PID of the process that is occupying the GPU. When CUDA_ERROR_OUT_OF_MEMORY occurs, you can use the following commands.

sudo fuser -v /dev/nvidia* # List the processes (and their PIDs) that are using the GPU device files
sudo kill -9 PID # Replace PID with the process ID found above to release its GPU memory
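The nvidia-smi check mentioned above can also be scripted. The following is a minimal sketch, assuming the installed nvidia-smi supports the `--query-compute-apps` query with the `pid`, `used_memory` and `process_name` fields; the function names `parse_compute_apps` and `list_gpu_processes` are hypothetical and chosen only for illustration.

```python
import subprocess
from typing import List, Optional, Tuple

# Query flags assumed to be supported by the installed nvidia-smi.
QUERY_CMD = [
    "nvidia-smi",
    "--query-compute-apps=pid,used_memory,process_name",
    "--format=csv,noheader,nounits",
]


def parse_compute_apps(csv_text: str) -> List[Tuple[int, Optional[int], str]]:
    """Parse 'pid, used_memory, process_name' rows into (pid, MiB or None, name)."""
    rows = []
    for line in csv_text.splitlines():
        # Split at most twice so a comma inside the process name stays intact.
        parts = [field.strip() for field in line.split(",", 2)]
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, mem, name = parts
        # Memory may be reported as a non-numeric marker on some setups.
        rows.append((int(pid), int(mem) if mem.isdigit() else None, name))
    return rows


def list_gpu_processes() -> List[Tuple[int, Optional[int], str]]:
    """Run nvidia-smi and return the parsed compute processes."""
    result = subprocess.run(QUERY_CMD, capture_output=True, text=True, check=True)
    return parse_compute_apps(result.stdout)
```

On a machine with the NVIDIA driver installed, calling `list_gpu_processes()` should return one tuple per process holding GPU memory, which makes it easier to decide which PID to kill. Parsing is kept separate from the subprocess call so it can be checked without a GPU.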

After killing the process, run fuser again to confirm that nothing is still holding the GPU:

sudo fuser -v /dev/nvidia* # List any processes that are still using the GPU

Repeat this until the offending processes are completely gone; after that, the GPU can be used for training as usual.
To limit how much GPU memory training is allowed to use, configure the session as follows:

import tensorflow as tf  # TensorFlow 1.x API

# Let this process use at most 90% of the GPU memory; adjust the fraction as needed
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.9)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
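The fraction does not have to be picked by hand. The sketch below is one possible way to derive it from the memory figures that nvidia-smi reports, assuming the fraction is applied to the device's total memory. `safe_memory_fraction` and its `headroom_mib` margin are hypothetical names and values, not TensorFlow defaults, and the numbers in the example are made up.

```python
def safe_memory_fraction(total_mib: float, used_mib: float,
                         headroom_mib: float = 256.0) -> float:
    """Return a fraction of total GPU memory that fits in what is currently free.

    headroom_mib is an arbitrary safety margin (an assumption for this sketch).
    """
    if total_mib <= 0:
        raise ValueError("total_mib must be positive")
    free_mib = total_mib - used_mib - headroom_mib
    # Clamp to [0, 1]; 0 means nothing is safely available.
    return max(0.0, min(1.0, free_mib / total_mib))


# Example with made-up numbers: an 8192 MiB card with 2048 MiB already in use.
fraction = safe_memory_fraction(total_mib=8192, used_mib=2048)
print(round(fraction, 3))  # prints 0.719

# The result would then be passed as
# tf.GPUOptions(per_process_gpu_memory_fraction=fraction).
```

If the function returns 0, another process is still holding most of the memory, and freeing it with the fuser and kill steps is likely the better fix.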

