沧海拾珠

Concatenating Datasets with Pandas

1. Concatenating two DataFrames

import numpy as np
import pandas as pd
from pandas import Series, DataFrame
df_1 = DataFrame(np.arange(16).reshape(4, 4))
    0   1   2   3
0   0   1   2   3
1   4   5   6   7
2   8   9  10  11
3  12  13  14  15
df_2 = DataFrame(np.arange(4).reshape(2, 2))
   0  1
0  0  1
1  2  3
# Concatenate df_1 and df_2
pd.concat([df_1, df_2], axis=1)  # concatenate along columns; pd.concat([df_1, df_2]) concatenates along rows
    0   1   2   3    0    1
0   0   1   2   3  0.0  1.0
1   4   5   6   7  2.0  3.0
2   8   9  10  11  NaN  NaN
3  12  13  14  15  NaN  NaN
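# Another way to concatenate: stack along rows and renumber the index
# (DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent)
pd.concat([df_1, df_2], ignore_index=True)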
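
To make the difference between the two axes concrete, here is a minimal sketch that rebuilds df_1 and df_2 as above and compares the shapes of the results. The variable names rows and cols are only for illustration and do not appear in the original post.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.arange(16).reshape(4, 4))
df_2 = pd.DataFrame(np.arange(4).reshape(2, 2))

# axis=0 (the default): stack the frames vertically.
# Columns are aligned by label; ignore_index=True renumbers the rows 0..5.
rows = pd.concat([df_1, df_2], ignore_index=True)

# axis=1: place the frames side by side.
# Rows are aligned by index label.
cols = pd.concat([df_1, df_2], axis=1)

print(rows.shape)  # (6, 4)
print(cols.shape)  # (4, 6)

# Labels present in only one input are filled with NaN in the other.
print(int(rows.isna().sum().sum()))  # 4
print(int(cols.isna().sum().sum()))  # 4
```

In both directions pandas aligns on labels rather than positions, which is why the smaller frame is padded with NaN instead of raising an error.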

2. Concatenating a DataFrame with a Series

series = Series([3,4,5,6])
series.name = 'added_series'
pd.concat([df_1,series],axis = 1)
    0   1   2   3  added_series
0   0   1   2   3             3
1   4   5   6   7             4
2   8   9  10  11             5
3  12  13  14  15             6
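
The Series lines up cleanly here because its default index (0 to 3) matches the DataFrame's. As a rough sketch of what happens when it does not, the example below adds a hypothetical Series named shifted whose index is offset by two. This variation is an assumption for illustration and is not part of the original post.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.arange(16).reshape(4, 4))
series = pd.Series([3, 4, 5, 6], name="added_series")

# The Series name becomes the label of the new column.
combined = pd.concat([df_1, series], axis=1)
print(list(combined.columns))  # [0, 1, 2, 3, 'added_series']

# Hypothetical variation: a Series whose index is 2..5 instead of 0..3.
shifted = pd.Series([3, 4, 5, 6], index=[2, 3, 4, 5], name="shifted")

# concat keeps the union of both indexes, so the result has 6 rows,
# with NaN wherever one of the inputs has no value for that label.
outer = pd.concat([df_1, shifted], axis=1)
print(outer.shape)  # (6, 5)
```

If the values should be attached by position rather than by label, one option is to reset the index of the Series before concatenating.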

3. Dropping specified rows or columns

# Drop the rows labeled 0 and 1 (axis=0 is the default)
df_1.drop([0, 1])
    0   1   2   3
2   8   9  10  11
3  12  13  14  15
# Drop the columns labeled 0 and 1
df_1.drop([0, 1], axis=1)
    2   3
0   2   3
1   6   7
2  10  11
3  14  15
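
One detail worth spelling out is that drop returns a new DataFrame and leaves df_1 itself unchanged. The sketch below shows this using the index= and columns= keywords, which are an alternative spelling of the axis argument. The names without_rows and without_cols are only for illustration.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame(np.arange(16).reshape(4, 4))

# Equivalent to df_1.drop([0, 1]): remove the rows labeled 0 and 1.
without_rows = df_1.drop(index=[0, 1])

# Equivalent to df_1.drop([0, 1], axis=1): remove the columns labeled 0 and 1.
without_cols = df_1.drop(columns=[0, 1])

print(df_1.shape)          # (4, 4), the original is not modified
print(without_rows.shape)  # (2, 4)
print(without_cols.shape)  # (4, 2)
```

To keep the result, assign it to a name (or back to df_1) as done here.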