The Series.drop method returns a new object with the specified index labels removed.
import pandas as pd
import numpy as np
s = pd.Series(np.arange(5.), index=['a', 'b', 'c', 'd', 'e'])
s
"""
a 0.0
b 1.0
c 2.0
d 3.0
e 4.0
dtype: float64
"""
s2 = s.drop('c')
s2
"""
a 0.0
b 1.0
d 3.0
e 4.0
dtype: float64
"""
s.drop(['d', 'c'])
"""
a 0.0
b 1.0
e 4.0
dtype: float64
"""
For a DataFrame, the axis parameter determines which axis the labels are dropped from.
df = pd.DataFrame(np.arange(16).reshape((4, 4)),
                  index=['BeiJ', 'ShangH', 'ShenZ', 'GuangZ'],
                  columns=['one', 'two', 'three', 'four'])
print(df)
"""
one two three four
BeiJ 0 1 2 3
ShangH 4 5 6 7
ShenZ 8 9 10 11
GuangZ 12 13 14 15
"""
Passing only a sequence of labels drops rows, matching the labels against the index:
df.drop(['BeiJ', 'ShenZ'])
"""
one two three four
ShangH 4 5 6 7
GuangZ 12 13 14 15
"""
df.drop(['BeiJ', 'ShenZ']) is shorthand for df.drop(index=['BeiJ', 'ShenZ']).
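One way to convince yourself of this equivalence is to compare the results directly. This is a minimal sketch that rebuilds the same DataFrame; the names by_position, by_axis, and by_keyword are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape((4, 4)),
                  index=['BeiJ', 'ShangH', 'ShenZ', 'GuangZ'],
                  columns=['one', 'two', 'three', 'four'])

by_position = df.drop(['BeiJ', 'ShenZ'])        # labels only, axis defaults to 0
by_axis = df.drop(['BeiJ', 'ShenZ'], axis=0)    # axis given explicitly
by_keyword = df.drop(index=['BeiJ', 'ShenZ'])   # index= keyword form

# All three spellings should produce the same frame
same = by_position.equals(by_axis) and by_position.equals(by_keyword)
print(same)  # True
```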
Setting the axis parameter to 1 or 'columns' drops columns instead:
df.drop(['one', 'three'], axis=1)
"""
two four
BeiJ 1 3
ShangH 5 7
ShenZ 9 11
GuangZ 13 15
"""
The statement above is equivalent to the following:
df.drop(columns=['one', 'three'])
"""
two four
BeiJ 1 3
ShangH 5 7
ShenZ 9 11
GuangZ 13 15
"""
To modify the original object in place instead of creating a new one, pass inplace=True:
df.drop(['BeiJ', 'ShenZ'], inplace=True)
print(df)
"""
one two three four
ShangH 4 5 6 7
GuangZ 12 13 14 15
"""
[1] NumPy Reference. https://numpy.org/doc/stable/reference/index.html
[2] Python for Data Analysis, 2nd edition. Wes McKinney.