Python: differences and connections between pandas iloc, loc, and ix

2022-01-23 23:07:54

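The pandas library is very powerful, but many people are confused by its indexing accessors iloc, loc, and ix. This post uses examples to explain the differences and connections between them, focusing on iloc and loc.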

ix is more complicated to use, so it is covered in detail in a separate blog post devoted to it.

First, a brief overview of the three.

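loc selects rows (or columns) by label: it retrieves the rows (or columns) of the index that carry a specific label. The key word is label, which can be understood as a name.
iloc selects rows (or columns) by position: it retrieves the rows (or columns) at a specific position in the index, so it only accepts integers. The key word is position, which can be understood as the row (or column) number, counted from 0.
ix usually tries to behave like loc, but falls back to behaving like iloc when the label is not present in the index. (This sentence is a bit convoluted; it is fine if it does not make sense yet.)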
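The label/position distinction is easiest to see when integer labels do not match positions. The following is a minimal sketch; the frame `df` and its labels 10, 20, 30 are made up for illustration and do not appear elsewhere in this post.

```python
import pandas as pd

# Hypothetical frame: integer row labels that deliberately differ from positions.
df = pd.DataFrame({'a': [1, 4, 7], 'b': [2, 5, 8]}, index=[10, 20, 30])

by_label = df.loc[10]      # the row whose label is 10
by_position = df.iloc[0]   # the first row, whatever its label is

print(by_label.equals(by_position))  # True: both select the same row

# Label 0 does not exist, so loc fails.
try:
    df.loc[0]
except KeyError as err:
    print('loc:', type(err).__name__)

# Position 10 does not exist, so iloc fails.
try:
    df.iloc[10]
except IndexError as err:
    print('iloc:', type(err).__name__)
```

Here `df.loc[10]` and `df.iloc[0]` return the same row, while swapping the arguments makes both lookups fail.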

Next, some examples.

1 loc

loc always follows one principle: it indexes by label!

import pandas as pd
df1 = pd.DataFrame(data= [[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=[0, 1, 2], columns=['a','b','c'])
df2 = pd.DataFrame(data= [[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=['e', 'f', 'g'], columns=['a', 'b', 'c'])
print(df1)
print(df2)
'''
df1:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
df2:
   a  b  c
e  1  2  3
f  4  5  6
g  7  8  9
'''

# loc selecting one row; the row label is an integer
print(df1.loc[0])
'''
a    1
b    2
c    3
Name: 0, dtype: int64
'''

# loc selecting one row; the row label is a string
print(df2.loc['e'])
'''
a    1
b    2
c    3
Name: e, dtype: int64
'''
# df2.loc[0] raises an error: loc looks up labels, and no row of df2 is labeled 0.
# (Recent pandas versions raise KeyError; older versions raised TypeError.)
try:
    print(df2.loc[0])
except (KeyError, TypeError) as err:
    print(f'{type(err).__name__}: {err}')
'''
KeyError: 0
'''

# loc selecting multiple rows (from label 1 to the end)
print(df1.loc[1:])
'''
   a  b  c
1  4  5  6
2  7  8  9
'''

# loc selecting multiple columns by label
print(df1.loc[:, ['a', 'b']])
'''
   a  b
0  1  2
1  4  5
2  7  8
'''
# df1.loc[:, 0:2] raises an error: loc looks up labels, and df1 has no columns labeled 0, 1, or 2.
try:
    print(df1.loc[:, 0:2])
except TypeError as err:
    print(f'{type(err).__name__}: {err}')
'''
TypeError: cannot do slice indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [0] of <class 'int'>
(the exact message depends on the pandas version)
'''

# loc selecting certain rows and certain columns
# (a label slice includes its end label, so 0:2 returns rows 0, 1, and 2)
print(df1.loc[0:2, ['a', 'b']])
'''
   a  b
0  1  2
1  4  5
2  7  8
'''
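Label slices also work with string labels, and both ends of the slice are included. A small sketch using the same df2 as above (the variable names `sub` and `value` are only for illustration):

```python
import pandas as pd

df2 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   index=['e', 'f', 'g'], columns=['a', 'b', 'c'])

# Label slices include the end label: rows 'e' and 'f', columns 'a' through 'b'.
sub = df2.loc['e':'f', 'a':'b']
print(sub)

# A single row label plus a single column label returns a scalar.
value = df2.loc['f', 'c']
print(value)  # 6
```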

2 iloc

iloc always follows one principle: it indexes by position!

import pandas as pd
df1 = pd.DataFrame(data= [[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=[0, 1, 2], columns=['a','b','c'])
df2 = pd.DataFrame(data= [[1, 2, 3], [4, 5, 6], [7, 8, 9]], index=['e', 'f', 'g'], columns=['a', 'b', 'c'])
print(df1)
print(df2)
'''
df1:
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9
df2:
   a  b  c
e  1  2  3
f  4  5  6
g  7  8  9
'''
# iloc selecting one row by position (here the row labels also happen to be integers)
print(df1.iloc[0])
'''
a    1
b    2
c    3
Name: 0, dtype: int64
'''

# iloc selecting one row when the row labels are strings.
# Writing df2.iloc['e'] in the style of loc raises an error: iloc does not recognize labels, only positions.
try:
    print(df2.iloc['e'])
except TypeError as err:
    print(f'{type(err).__name__}: {err}')
'''
TypeError: cannot do positional indexing on <class 'pandas.core.indexes.base.Index'> with these indexers [e] of <class 'str'>
(the exact message depends on the pandas version)
'''
# The correct way to select that row with iloc:
# whatever the type of the index labels, iloc only takes positions, that is, integers.
print(df2.iloc[0])
'''
a    1
b    2
c    3
Name: e, dtype: int64
'''

# iloc selecting multiple rows (from position 1 to the end)
print(df1.iloc[1:])
'''
   a  b  c
1  4  5  6
2  7  8  9
'''

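# iloc selecting multiple columns.
# Passing column labels raises an error.
# (Older pandas versions raised the TypeError shown below; recent versions raise IndexError instead.)
try:
    print(df1.iloc[:, ['a', 'b']])
except (TypeError, IndexError) as err:
    print(f'{type(err).__name__}: {err}')
'''
TypeError: cannot perform reduce with flexible type
'''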
# The correct way to select multiple columns with iloc: use positions.
# (A position slice excludes its end, so 0:2 returns columns 0 and 1.)
print(df1.iloc[:, 0:2])
'''
   a  b
0  1  2
1  4  5
2  7  8
'''

# iloc selecting certain rows and certain columns
print(df1.iloc[0:2, 0:1])
'''
   a
0  1
1  4
'''
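One practical difference shows up when the same slice is written with both accessors: iloc slices exclude the end position, like ordinary Python slices, while loc slices include the end label. A short sketch on df1 (the variable names are only for illustration):

```python
import pandas as pd

df1 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   index=[0, 1, 2], columns=['a', 'b', 'c'])

rows_by_position = df1.iloc[0:2]   # positions 0 and 1 (end excluded)
rows_by_label = df1.loc[0:2]       # labels 0, 1, and 2 (end included)
print(len(rows_by_position), len(rows_by_label))  # 2 3

# Negative positions count from the end, as with Python lists.
last_row = df1.iloc[-1]
print(last_row.tolist())  # [7, 8, 9]
```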

3 ix

ix is more complicated to use. It has been deprecated since pandas 0.20.0 (and was removed in pandas 1.0), and iloc and loc are recommended in its place.

If you are interested in how ix is used, see the separate blog post on it.
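Code that relied on ix to mix a row label with a column position can usually be rewritten with loc or iloc by converting one side. The following is one possible sketch, not the only way to do it; `v1` and `v2` are illustrative names.

```python
import pandas as pd

df2 = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],
                   index=['e', 'f', 'g'], columns=['a', 'b', 'c'])

# Goal: row chosen by label 'f', column chosen by position 2.

# Option 1: turn the column position into a label, then use loc.
v1 = df2.loc['f', df2.columns[2]]

# Option 2: turn the row label into a position, then use iloc.
v2 = df2.iloc[df2.index.get_loc('f'), 2]

print(v1, v2)  # 6 6
```

Both options make it explicit which side is a label and which is a position, which is the ambiguity that ix left implicit.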
