Prerequisites
- hadoop-2.8.0.tar.gz
- hbase-1.3.1-bin.tar.gz
- zookeeper-3.4.9.tar.gz
- CentOS Linux release 7.6.1810 (Core), 3 machines
Configure passwordless SSH login
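## Generate a key pair (RSA rather than DSA: OpenSSH 7.0 and later, which includes the version shipped with CentOS 7.6, disables DSA keys by default)
$ ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
## Append the public key to authorized_keys
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
## Copy the key file to every slave machine
$ scp ~/.ssh/authorized_keys slave1@slave1:~/.ssh/authorized_keys_from_master
## On each slave, go into the .ssh directory and run
$ cat authorized_keys_from_master >> authorized_keys
### Configure /etc/hosts on every host
/etc/hosts
MASTER_IP master
SLAVE1_IP slave1
SLAVE2_IP slave2
### The last step is important: log in from master to slave1 and slave2 once.
### The first login asks for interactive confirmation; later logins do not.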
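Appending with `>>` adds a duplicate line every time the step is repeated, and sshd's default StrictModes setting rejects key files or directories that are writable by other users. The following is a minimal sketch of an idempotent version of the append step; `append_key` is a hypothetical helper name, not something from the original setup.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption, not part of the original steps):
# append a public key to an authorized_keys file only if that exact line
# is not there yet, then restrict permissions on the file and its directory.
append_key() {
    local pubkey_file="$1" auth_file="$2"
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -q quiet, -x whole-line match, -F fixed string, -f read patterns from file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Example, run on a slave after the scp step:
# append_key ~/.ssh/authorized_keys_from_master ~/.ssh/authorized_keys
```

Running the helper twice with the same key leaves a single entry in the target file.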
Configuration files
### The configuration files live under hadoop/etc/hadoop; copy the contents below into each file in turn
### core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/httx/run/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.aboutyun.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.aboutyun.groups</name>
<value>*</value>
</property>
</configuration>
### hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/httx/run/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/httx/run/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
### mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master:19888</value>
</property>
</configuration>
### yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8089</value>
</property>
</configuration>
Do not forget to copy core-site.xml and hdfs-site.xml into the hbase/conf directory as well.
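The copy step can be sketched as a small function that fails loudly when one of the two files is missing. `sync_hdfs_conf` is a hypothetical name, and the paths in the usage comment are assumptions based on the /httx/run directories used elsewhere in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption): copy the two HDFS client config files
# from the Hadoop conf directory into the HBase conf directory.
sync_hdfs_conf() {
    local hadoop_conf="$1" hbase_conf="$2" f
    for f in core-site.xml hdfs-site.xml; do
        if [ ! -f "$hadoop_conf/$f" ]; then
            echo "missing $hadoop_conf/$f" >&2
            return 1
        fi
        cp "$hadoop_conf/$f" "$hbase_conf/$f"
    done
}

# Example (assumed install paths):
# sync_hdfs_conf /httx/run/hadoop/etc/hadoop /httx/run/hbase/conf
```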
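Configure the JDK
hadoop-env.sh
Set JAVA_HOME (for example export JAVA_HOME=/usr/jdk1.7; on this cluster the JDK is at /httx/run/jdk)
yarn-env.sh
Set JAVA_HOME the same way (for example export JAVA_HOME=/usr/jdk1.7)
slaves (this file lists every slave node, one per line)
slave1
slave2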
Startup
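### Format the NameNode once, on master only (formatting again wipes the existing HDFS metadata)
hadoop/bin/hdfs namenode -format
### Then start HDFS and YARN from master
hadoop/sbin/start-dfs.sh
hadoop/sbin/start-yarn.sh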
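Because a second format destroys the NameNode metadata, it can help to guard the format command. A formatted name directory contains a current/VERSION file, so a minimal sketch of such a guard looks like this; `namenode_needs_format` is a hypothetical helper name, and the directory in the usage comment is the dfs.namenode.name.dir value from hdfs-site.xml above.

```shell
#!/usr/bin/env bash
# Hypothetical guard (an assumption, not part of the original steps):
# succeeds (exit status 0) only when the name directory has no
# current/VERSION file, i.e. it has not been formatted yet.
namenode_needs_format() {
    local name_dir="$1"
    [ ! -f "$name_dir/current/VERSION" ]
}

# Example, run on master:
# if namenode_needs_format /httx/run/hadoop/dfs/name; then
#     hadoop/bin/hdfs namenode -format
# fi
```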
ZooKeeper installation and startup
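## In zookeeper/conf, create zoo.cfg from the sample
mv zoo_sample.cfg zoo.cfg
## Settings to change in zoo.cfg
dataDir=/httx/run/zookeeper/data/data
dataLogDir=/httx/run/zookeeper/data/log
server.1=host1:2888:3888
server.2=host2:2888:3888
server.3=host3:2888:3888
## The dataDir holds a file named myid that contains a single number identifying this server. The id must be unique across the whole cluster.
## ZooKeeper uses it to pick the matching server.x line: with id 1, this server takes the server.1 entry in zoo.cfg.
## Create myid in /httx/run/zookeeper/data/data, one command per machine:
echo "1" > /httx/run/zookeeper/data/data/myid   ## on host1
echo "2" > /httx/run/zookeeper/data/data/myid   ## on host2
echo "3" > /httx/run/zookeeper/data/data/myid   ## on host3
## Start ZooKeeper on every machine
cd /httx/run/zookeeper
./bin/zkServer.sh start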
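Since each myid must match the server.x line for that machine, the number can be looked up from zoo.cfg instead of being typed by hand. The sketch below assumes the host part of each server.x line is exactly the name you pass in; `myid_for_host` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption): print the N of the
# "server.N=host:port:port" line in zoo.cfg whose host equals the given name.
# Prints nothing when no line matches.
myid_for_host() {
    local cfg="$1" host="$2"
    awk -F= -v h="$host" '
        $1 ~ /^server\.[0-9]+$/ {
            split($2, addr, ":")          # addr[1] is the host part
            if (addr[1] == h) {
                sub(/^server\./, "", $1)  # keep only the numeric id
                print $1
                exit
            }
        }' "$cfg"
}

# Example (assumes zoo.cfg uses the names that `hostname` returns):
# myid_for_host /httx/run/zookeeper/conf/zoo.cfg "$(hostname)"
```

Check that the function printed a number before writing it to the myid file, since an empty myid prevents the server from starting.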
HBase installation and startup
hbase-env.sh configuration
export JAVA_HOME=/httx/run/jdk
export HBASE_MANAGES_ZK=false
export HBASE_HEAPSIZE=8000
export HBASE_LOG_DIR=/httx/run/hbase/log
hbase-site.xml configuration
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zkhost1:2181,zkhost2:2181,zkhost3:2181</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>400000</value>
</property>
<property>
<name>hbase.hregion.max.filesize</name>
<value>64424509440</value>
</property>
</configuration>
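The hbase.hregion.max.filesize value above is given in bytes. As a quick sanity check, shell arithmetic shows that 64424509440 corresponds to 60 GiB; the variable names here are only illustrative.

```shell
#!/usr/bin/env bash
# Illustrative arithmetic: convert 60 GiB to bytes for hbase.hregion.max.filesize.
gib=$((1024 * 1024 * 1024))
max_filesize=$((60 * gib))
echo "$max_filesize"   # prints 64424509440
```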
Configure regionservers (conf/regionservers lists the RegionServer hostnames, one per line)
Start HBase
./bin/start-hbase.sh
./bin/hbase shell
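Following the steps above, the installation succeeded and tables could be created, but put failed with an error. After a lot of searching, the cause turned out to be nothing more than the hosts file on Linux (CentOS): the default localhost.localdomain entries in /etc/hosts.
Change the following /etc/hosts file: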
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
HOST1_IP db-apmKVStore-1.100.idc.tf56 db-apmKVStore-1
HOST1_IP master
HOST2_IP slave1
HOST3_IP slave2
Change it to:
#127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
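HOST1_IP db-apmKVStore-1.100.idc.tf56 db-apmKVStore-1
HOST1_IP master
HOST2_IP slave1
HOST3_IP slave2
That is, comment out every localhost.localdomain entry.
On any server that runs an HBase client, remember to add the following to /etc/hosts as well; otherwise the client cannot reach HBase: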
HOST1_IP master
HOST2_IP slave1
HOST3_IP slave2
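To check whether a machine still has the problematic entries, a small sketch like the following lists the active (uncommented) loopback lines of a hosts file that mention a given name, such as localhost.localdomain. `loopback_entries` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (an assumption): print every uncommented line of a
# hosts file that maps the given name to a loopback address (127.x.x.x or ::1).
loopback_entries() {
    local hosts_file="$1" name="$2"
    awk -v n="$name" '
        /^[ \t]*#/ { next }                      # skip commented-out lines
        $1 ~ /^127\./ || $1 == "::1" {
            for (i = 2; i <= NF; i++)
                if ($i == n) { print; break }    # print the line once
        }' "$hosts_file"
}

# Example:
# loopback_entries /etc/hosts localhost.localdomain
```

No output means the name no longer appears on an active loopback line.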