
Big Data Platform Component Routine O&M Guide (Hadoop / ZooKeeper / Kafka / ES / MySQL / Spark / Flume / Logstash / Tomcat)

guduadmin · 21 days ago

Hadoop Routine O&M Guide

HDFS

The production Hadoop deployment is a cluster of 30 servers with a uniform installation and configuration, version 2.7.7.

Deployment path: /opt/hadoop

Startup user: hadoop

Configuration files:

  • /opt/hadoop/config/hdfs-site.xml
  • /opt/hadoop/config/core-site.xml

Hadoop runtime environment variable files:

  • hadoop-env.sh
  • journalnode.env
  • datanode.env
  • namenode.env

Hadoop systemd service unit files:

  • zkfc.service
  • journalnode.service
  • namenode.service
  • datanode.service

Data/snapshot storage directory: /data/hadoop/data

Log output directory: /data/hadoop/logs

When Hadoop is running normally, the following ports are open:

  • 50010: HDFS DataNode service port, used for data transfer
  • 50075: HDFS DataNode HTTP port
  • 50020: HDFS DataNode IPC port
  • 50070: HDFS NameNode HTTP port, open on the active NameNode
  • 8020: HDFS NameNode RPC port for client connections, used to fetch filesystem metadata

          [hadoop@hostname-2 ~]$ netstat -ln|egrep "(50010|50075|50475|50020|50070|50470|8020|8019)"

          tcp        0      0 172.0.0.2:50070      0.0.0.0:*               LISTEN

          tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN

          tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN

          tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN
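The expected ports above can be verified in one pass. A minimal sketch: here the netstat snapshot is inlined to mirror the sample output above, so the snippet runs anywhere; on a real node, assign `snapshot=$(netstat -ln)` instead.

```shell
# Check each expected HDFS port against a netstat snapshot.
# Sample snapshot inlined for illustration; in production use: snapshot=$(netstat -ln)
snapshot='tcp        0      0 172.0.0.2:50070      0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN'
missing=""
for port in 50010 50075 50020 50070 8020; do
  echo "$snapshot" | grep -q ":$port " || missing="$missing $port"
done
echo "missing:$missing"   # 8020 is absent from the sample snapshot above
```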

Hadoop official reference documentation

Hadoop component start and stop commands

# Start

sudo systemctl start namenode.service

sudo systemctl start datanode.service

sudo systemctl start journalnode.service

# Stop

sudo systemctl stop namenode.service

sudo systemctl stop datanode.service

sudo systemctl stop journalnode.service

# Check status

sudo systemctl status namenode.service

sudo systemctl status datanode.service

sudo systemctl status journalnode.service

# Enable start at boot

sudo systemctl enable namenode.service

sudo systemctl enable datanode.service

sudo systemctl enable journalnode.service

Checking Hadoop component status

# List the configured NameNodes

[hadoop@hostname-2 ~]$ hdfs getconf -namenodes

hostname-3 hostname-2

# Show the DataNode include file for the cluster

[hadoop@hostname-2 ~]$ hdfs getconf -includeFile

/opt/hadoop/config/slaves

# Show the NameNode RPC addresses

[hadoop@hostname-2 ~]$ hdfs getconf -nnRpcAddresses

hostname-3:9000

hostname-2:9000

# Read a single configuration value, e.g. hdfs getconf -confKey dfs.replication

hdfs getconf -confKey [key]

# dfsadmin: report live DataNodes

          [hadoop@hostname-2 ~]$  hdfs dfsadmin -report -live

          Configured Capacity: 422346469376 (393.34 GB)

          Present Capacity: 317439557632 (295.64 GB)

          DFS Remaining: 315510235136 (293.84 GB)

          ...

          -------------------------------------------------

          Live datanodes (3):

          Name: 172.0.0.3:50010 (hostname-3)

          Hostname: iZ8vbacq1jxnabyu7992d1Z

          Decommission Status : Normal

          ...

          Name: 172.0.0.1:50010 (hostname-1)

          Hostname: iZ8vb2s7y1j8fqmqbmufz9Z

          Decommission Status : Normal

          ...

          Name: 172.0.0.2:50010 (iZ8vbacq1jxnabyu7992d2Z)

          Hostname: iZ8vbacq1jxnabyu7992d2Z

          Decommission Status : Normal

          ...

# haadmin: check which NameNode is active

          [hadoop@hostname-2 ~]$ hdfs haadmin -getServiceState hostname-2

          active
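To find the active NameNode programmatically, one can loop over the pair reported by `hdfs getconf -namenodes`. A sketch of that pattern: the `get_state` function here mocks `hdfs haadmin -getServiceState` so the snippet is self-contained; on a real node, call the hdfs command directly.

```shell
# Determine which NameNode is active by querying each in turn.
# get_state mocks `hdfs haadmin -getServiceState "$1"` for illustration.
get_state() { [ "$1" = "hostname-2" ] && echo active || echo standby; }
active_nn=""
for nn in hostname-3 hostname-2; do
  if [ "$(get_state "$nn")" = "active" ]; then
    active_nn=$nn
  fi
done
echo "active NameNode: $active_nn"
```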

YARN

Startup user: hadoop

Configuration file:

• /opt/hadoop/config/yarn-site.xml

Environment variable files:

• yarn.env
• zkfc.env

Systemd service unit files:

• yarn-nm.service
• yarn-rm.service
• zkfc.service

When YARN is running normally, the following ports are open:

• 8030: YARN ResourceManager scheduler IPC port
• 8031: YARN ResourceManager resource-tracker RPC port
• 8032: YARN ResourceManager ApplicationsManager (ASM) port
• 8033: YARN ResourceManager admin IPC port
• 8088: YARN ResourceManager HTTP port
• 10020: YARN JobHistory Server IPC port
• 18080: YARN JobHistory Server HTTP port

                  [hadoop@hostname-2 ~]$ netstat -ln|egrep "(8032|8030|8031|8033|8088)"

                  tcp        0      0 172.0.0.2:8088       0.0.0.0:*               LISTEN

                  tcp        0      0 172.0.0.2:8030       0.0.0.0:*               LISTEN

                  tcp        0      0 172.0.0.2:8031       0.0.0.0:*               LISTEN

                  tcp        0      0 172.0.0.2:8032       0.0.0.0:*               LISTEN

                  tcp        0      0 172.0.0.2:8033       0.0.0.0:*               LISTEN

                  [hadoop@hostname-1 ~]$ netstat -ln|egrep "(10020|18080)"

                  tcp        0      0 0.0.0.0:18080           0.0.0.0:*               LISTEN

                  tcp        0      0 0.0.0.0:10020           0.0.0.0:*               LISTEN

YARN official documentation

YARN service start and stop

# Start

sudo systemctl start yarn-rm.service

sudo systemctl start yarn-nm.service

# Stop

sudo systemctl stop yarn-rm.service

sudo systemctl stop yarn-nm.service

# Check status

sudo systemctl status yarn-rm.service

sudo systemctl status yarn-nm.service

# Enable start at boot

sudo systemctl enable yarn-rm.service

sudo systemctl enable yarn-nm.service

YARN status check commands

                  [hadoop@hostname-2 ~]$ yarn node -list

                  Total Nodes:2

                           Node-Id             Node-State Node-Http-Address       Number-of-Running-Containers

                  iZ8vbacq1jxnabyu7992d1Z:46719           RUNNING iZ8vbacq1jxnabyu7992d1Z:8042                               0

                  iZ8vb2s7y1j8fqmqbmufz9Z:40138           RUNNING iZ8vb2s7y1j8fqmqbmufz9Z:8042                               0

Check the HA state of a ResourceManager node

                  [hadoop@hostname-2 ~]$ yarn rmadmin -getServiceState hostname-2

                  active

Request the service to perform a health check. If the check fails, the RMAdmin tool exits with a non-zero exit code.

                  [hadoop@hostname-2 ~]$ yarn rmadmin -checkHealth hostname-2 ; echo $?

                  0
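The non-zero-exit behavior makes `-checkHealth` easy to wire into scripted monitoring. A sketch of the pattern: the check command is mocked with `true` so the snippet is self-contained; substitute the real `yarn rmadmin -checkHealth <host>` call in production.

```shell
# Alert on a failed health check via its exit code.
health_check() { true; }   # mock; in production: yarn rmadmin -checkHealth hostname-2
if health_check; then
  status=healthy
else
  status=UNHEALTHY   # hook an alert (mail, webhook, ...) here
fi
echo "resourcemanager: $status"
```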

ZooKeeper Routine O&M Guide

The production ZooKeeper deployment is a cluster of three servers with a uniform installation and configuration, version 3.4.14.

Startup user: logmanager

Deployment path: /opt/zookeeper

Configuration file: /opt/zookeeper/conf/zoo.cfg

Snapshot storage directory: /data/zookeeper/data

Transaction log output directory: /data/zookeeper/logs

Runtime log output directory: /data/zookeeper/logs

When ZooKeeper is running normally, three ports are open: 2181, 2888, and 3888.

• 2181: client service port, open on every node
• 2888: leader/follower communication port, open only on the leader
• 3888: leader-election port, open on every node

                    [hadoop@hostname-3 ~]$ netstat -ln|egrep "(2181|2888|3888)"

                    tcp        0      0 0.0.0.0:2181         0.0.0.0:*               LISTEN

                    tcp        0      0 172.0.0.3:2888       0.0.0.0:*               LISTEN

                    tcp        0      0 172.0.0.3:3888       0.0.0.0:*               LISTEN

ZooKeeper start and stop

# Start

sudo systemctl start zookeeper.service

# Check status

systemctl status zookeeper.service

# Stop

sudo systemctl stop zookeeper.service

# Enable start at boot

sudo systemctl enable zookeeper.service

Check ZooKeeper node status

Method 1

                    [hadoop@hostname-1 ~]$ /opt/zookeeper/bin/zkServer.sh status

                    ZooKeeper JMX enabled by default

                    Using config: /opt/zookeeper/bin/../conf/zoo.cfg

                    Mode: leader

Method 2

                    [hadoop@hostname-1 ~]$ echo stat | nc 127.0.0.1 2181

                    Zookeeper version: 3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf, built on 03/06/2019 16:18 GMT

                    Clients:

                     /127.0.0.1:60696[0](queued=0,recved=1,sent=0)

                     /172.0.0.2:53934[1](queued=0,recved=595720,sent=595742)

                     /172.0.0.3:42448[1](queued=0,recved=594837,sent=594837)

                    Latency min/avg/max: 0/0/137

                    Received: 1190603

                    Sent: 1190624

                    Connections: 3

                    Outstanding: 0

                    Zxid: 0x1240000e71a

                    Mode: follower

                    Node count: 229

Test whether the server is running; a reply of imok means it is up.

                    [hadoop@hostname-1 ~]$ echo ruok | nc 127.0.0.1 2181

                    imok
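When scripting these four-letter-word checks, the `Mode:` line from `stat` identifies each node's role. A sketch using an inlined sample reply so it runs anywhere; in production, pipe `echo stat | nc <host> 2181` into the same awk instead.

```shell
# Extract the node role from a ZooKeeper `stat` reply.
# Sample reply inlined; in production: stat_out=$(echo stat | nc 127.0.0.1 2181)
stat_out='Zookeeper version: 3.4.14
Latency min/avg/max: 0/0/137
Mode: follower
Node count: 229'
mode=$(echo "$stat_out" | awk -F': ' '/^Mode:/ {print $2}')
echo "role: $mode"
```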

Kafka Routine O&M Guide

The production Kafka deployment is a cluster of three servers with a uniform installation and configuration, version 2.11-1.10.

Startup user: logmanager

Deployment path: /opt/kafka

Configuration file: /opt/kafka/config/server.properties

Data storage directory: /data/kafka/data

Log output directory: /data/kafka/logs

When Kafka is running normally, two ports are involved: 2181 and 9092.

• 2181: the ZooKeeper service port. Kafka stores broker and consumer metadata in ZooKeeper; ZooKeeper tracks the liveness of every broker, and each broker reports its status by sending heartbeat requests to ZooKeeper.
• 9092: the Kafka broker port for client and inter-broker communication

                      [hadoop@hostname-3 ~]$ netstat -ln|egrep "(2181|9092)"

                      tcp        0      0 0.0.0.0:2181            0.0.0.0:*               LISTEN

                      tcp        0      0 0.0.0.0:9092            0.0.0.0:*               LISTEN

Service start and stop

# Start

sudo systemctl start kafka.service

# Check status

sudo systemctl status kafka.service

# Stop

sudo systemctl stop kafka.service

# Enable start at boot

sudo systemctl enable kafka.service

List current Kafka topics

                      [hadoop@hostname-1 opt]$ kafka-topics.sh -list --zookeeper 172.0.0.3:2181  

                      EXECUTE_LOG_TOPIC

                      METRICBEAT_LOG

                      ROBOT_MAIN_PROCESS_EXECUTE_MESSAGE

                      __consumer_offsets

                      agent-status

                      flume-sink

                      logmanager-filebeat

                      logmanager-flume

                      logstash-filebeat

                      origin-biz-log

Describe a topic

                      [hadoop@hostname-1 opt]$ kafka-topics.sh --zookeeper 172.0.0.2:2181 --topic "agent-status" --describe

                      Topic: agent-status     PartitionCount: 1       ReplicationFactor: 3    Configs:

                              Topic: agent-status     Partition: 0    Leader: 1       Replicas: 1,2,3 Isr: 1,2,3
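A partition is under-replicated when its ISR is shorter than its replica list, and the `--describe` output above can be checked for that mechanically. A sketch against an inlined sample partition line; in production, feed the real `kafka-topics.sh --describe` output (newer Kafka also offers a `--describe --under-replicated-partitions` shortcut).

```shell
# Compare the replica count to the ISR count for one partition line.
line='Topic: agent-status     Partition: 0    Leader: 1       Replicas: 1,2,3 Isr: 1,2,3'
replicas=$(echo "$line" | grep -o 'Replicas: [0-9,]*' | cut -d' ' -f2 | tr ',' '\n' | wc -l)
isr=$(echo "$line" | grep -o 'Isr: [0-9,]*' | cut -d' ' -f2 | tr ',' '\n' | wc -l)
if [ "$replicas" -eq "$isr" ]; then
  echo "fully replicated"
else
  echo "under-replicated"
fi
```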

Describe a specific consumer group

                      [hadoop@hostname-1 opt]$ ./kafka-consumer-groups.sh --new-consumer --bootstrap-server 192.168.52.131:9092 --group test2 --describe

Check the Kafka version (the jar filename encodes the Scala and Kafka versions)

[hadoop@hostname-1 opt]$ find ./libs/ -name \*kafka_\* | head -1 | grep -o 'kafka[^\n]*'

Describe all topics in the cluster

                      [hadoop@hostname-1 opt]$ bin/kafka-topics.sh --describe --zookeeper 127.0.0.1:2181

Elasticsearch Routine O&M Guide

The production Elasticsearch deployment is a cluster of three servers with a uniform installation and configuration, version 5.4.3.

Startup user: logmanager

Deployment path: /opt/elasticsearch

Configuration file: /opt/elasticsearch/config/elasticsearch.yml

Data storage directory: /data/elasticsearch/data

Log output directory: /data/elasticsearch/logs

When Elasticsearch is running normally, two ports are open: 9200 and 9300.

• 9200: the HTTP port for client requests, open on every node
• 9300: the transport port for inter-node communication, open on every node

                        [hadoop@hostname-3 ~]$ netstat -ln|egrep "(9300|9200)"

                        tcp        0      0 0.0.0.0:9200            0.0.0.0:*               LISTEN

                        tcp        0      0 0.0.0.0:9300            0.0.0.0:*               LISTEN

Service start and stop

# Start

sudo systemctl start elasticsearch.service

# Check status

sudo systemctl status elasticsearch.service

# Stop

sudo systemctl stop elasticsearch.service

# Enable start at boot

sudo systemctl enable elasticsearch.service

Check the ES version information

                        [user@hostname-1 ~]$ curl -XGET localhost:9200

                        {

                          "name" : "hostname-1",

                          "cluster_name" : "elastic-cyclone",

                          "cluster_uuid" : "-OfufJGMQfylFBm34d0SKg",

                          "version" : {

                            "number" : "5.4.3",

                            "build_hash" : "eed30a8",

                            "build_date" : "2017-06-22T00:34:03.743Z",

                            "build_snapshot" : false,

                            "lucene_version" : "6.5.1"

                          },

                          "tagline" : "You Know, for Search"

                        }

Common operations

Adjust replica count: `curl -XPUT 'http://localhost:9200/yunxiaobai/_settings?pretty' -d '{"settings":{"index":{"number_of_replicas":"10"}}}'`

Create an index: `curl -XPUT 'localhost:9200/yunxiaobai?pretty'`

Insert a document: `curl -XPUT 'localhost:9200/yunxiaobai/external/1?pretty' -d '{ "name":"yunxiaobai" }'`

Get a document: `curl -XGET 'localhost:9200/yunxiaobai/external/1?pretty'`

Delete an index: `curl -XDELETE 'localhost:9200/jiaozhenqing?pretty'`

Exclude a node from shard allocation: `curl -XPUT '127.0.0.1:9200/_cluster/settings?pretty' -d '{ "transient":{"cluster.routing.allocation.exclude._ip":"10.0.0.1"}}'`

Delete a template: `curl -XDELETE http://127.0.0.1:9200/_template/metricbeat-6.2.4`

Adjust the shard refresh interval: `curl -XPUT 'http://localhost:9200/metricbeat-6.2.4-2018.05.21/_settings?pretty' -d '{"settings":{"index":{"refresh_interval":"30s"}}}'`

Submit a template from a config file: `curl -XPUT localhost:9200/_template/devops-logstore-template -d @devops-logstore.json`

Query a template: `curl -XGET localhost:9200/_template/devops-logstore-template`

Query the bulk thread pool: `curl -XGET 'http://localhost:9200/_cat/thread_pool/bulk?v&h=ip,name,active,rejected,completed'`
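Cluster health is usually the first thing to check; `curl localhost:9200/_cluster/health?pretty` reports a green/yellow/red status. A parsing sketch with an inlined sample response (field values illustrative); in production, pipe real curl output into the same extraction, or use a JSON tool such as jq.

```shell
# Pull the status field out of a _cluster/health response.
# Sample response inlined; in production: health=$(curl -s localhost:9200/_cluster/health)
health='{"cluster_name":"elastic-cyclone","status":"green","number_of_nodes":3}'
status=$(echo "$health" | grep -o '"status":"[a-z]*"' | cut -d'"' -f4)
echo "cluster status: $status"
```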

MySQL Routine O&M Guide

The production MySQL deployment is a cluster of three servers with a uniform installation and configuration, version 5.7.

Startup user: mysql

Deployment path: /usr/share/mysql

Configuration file: /etc/my.cnf

Data storage directory: /var/lib/mysql/mysql

Server log path: /var/log/mysqld.log

Binary log path: /var/lib/mysql

When MySQL is running normally, one port is open: 3306.

• 3306: the MySQL client service port

                          [hadoop@hostname-3 ~]$ netstat -ln|egrep 3306

                          tcp6       0      0 :::3306                 :::*                    LISTEN

Service start and stop

# Start

sudo systemctl start mysqld.service

# Check status

sudo systemctl status mysqld.service

# Stop

sudo systemctl stop mysqld.service

# Enable start at boot

sudo systemctl enable mysqld.service

Log in to MySQL and view database information

mysql -u <username> -p

# enter the password when prompted

                          mysql> show databases;

                          +--------------------+

                          | Database           |

                          +--------------------+

                          | information_schema |

                          | mysql              |

                          | performance_schema |

                          | sys                |

                          | test               |

                          +--------------------+
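For scripted checks, the same query can run non-interactively with `mysql -e`. A sketch of the pattern: `run_query` here mocks the client output so the snippet is self-contained; in production, use `mysql -u <user> -p -e 'show databases;'` (or a credentials file) in its place.

```shell
# Verify that an expected schema appears in the database list.
# run_query mocks `mysql -u <user> -p -Ne 'show databases;'` for illustration.
run_query() { printf 'information_schema\nmysql\nperformance_schema\nsys\ntest\n'; }
if run_query | grep -qx 'test'; then
  echo "schema test: present"
else
  echo "schema test: MISSING"
fi
```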

Spark Routine O&M Guide

Production Spark is deployed as Spark on YARN: Spark jobs are scheduled by YARN, while the Spark history server is configured separately and listens on port 18080 when started. Spark version: 1.7.0.

Startup user: hadoop

Deployment path: /opt/spark

Configuration file: /opt/spark/conf/spark-conf.properties

Data storage directory: /data/spark/data

Log path: /data/spark/logs

When Spark is running normally, two ports are in use: 18080 and 8088.

• 18080: the Spark history server port, used to browse historical job records
• 8088: the YARN ResourceManager HTTP port, where running Spark-on-YARN applications appear

# Start

systemctl start spark-history.service

# Enable start at boot

systemctl enable spark-history.service

# Stop

systemctl stop spark-history.service

# Check status

systemctl status spark-history.service

Flume Routine O&M Guide

Production Flume runs as independent nodes, deployed on whichever servers need it, each installed and configured separately; version 1.7.0.

Startup user: logmanager

Deployment path: /opt/flume

Configuration file: /opt/flume/conf/flume-conf.properties

Data storage directory: /data/flume/data

Log path: /data/flume/logs

When Flume is running normally, one port is open: 4541.

• 4541: the Flume service port

                              [hadoop@hostname-2 ~]$ netstat -ln|egrep 4541

                              tcp        0      0 172.0.0.2:4541       0.0.0.0:*               LISTEN

Service start and stop

# Start

sudo systemctl start flume.service

# Check status

sudo systemctl status flume.service

# Stop

sudo systemctl stop flume.service

# Enable start at boot

sudo systemctl enable flume.service

Check the port on the master node

                              [hadoop@hostname-2 ~]$ sudo netstat -lntp |grep 4541

                              tcp        0      0 172.0.0.2:4541       0.0.0.0:*               LISTEN  7774/java

Inspecting queued messages

To inspect messages in a queue, you can install kafka-tools and dump part of a topic's data to view its content.

Logstash Routine O&M Guide

The production Logstash deployment is a cluster of three servers with a uniform installation and configuration, version 2.4.1.

Startup user: logmanager

Deployment path: /opt/logstash

Configuration file: /opt/logstash/conf/logstash.yml

Data storage directory: /data/logstash/data

Log path: /data/logstash/logs

When Logstash is running normally, two ports are open: 5044 and 9600.

• 5044: the Logstash input port, used to receive data
• 9600: the API port for querying basic Logstash information

                                [hadoop@hostname-2 ~]$ netstat -ln|egrep "(9600|5044)"

                                tcp        0      0 0.0.0.0:5044            0.0.0.0:*               LISTEN

                                tcp        0      0 127.0.0.1:9600          0.0.0.0:*               LISTEN

Service start and stop

# Start

sudo systemctl start logstash.service

# Check status

sudo systemctl status logstash.service

# Stop

sudo systemctl stop logstash.service

# Enable start at boot

sudo systemctl enable logstash.service

Port 9600 returns basic Logstash information:

                                [hadoop@hostname-2 ~]$ curl -XGET 'localhost:9600/?pretty'

                                {

                                  "host" : "iZ8vbacq1jxnabyu7992d2Z",

                                  "version" : "7.7.0",

                                  "http_address" : "127.0.0.1:9600",

                                  "id" : "c9662897-7c12-4eb3-a92c-772da4536730",

                                  "name" : "logmanager",

                                  "ephemeral_id" : "99c86cbf-182a-46c5-9cc9-05f5bd13075b",

                                  "status" : "green",

                                  "snapshot" : false,

                                  "pipeline" : {

                                    "workers" : 8,

                                    "batch_size" : 125,

                                    "batch_delay" : 50

                                  },

                                  "build_date" : "2020-05-12T04:34:14+00:00",

                                  "build_sha" : "d8ed01157be10d78e9910f1fb21b137c5d25529e",

                                  "build_snapshot" : false

                                }
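The same endpoint is handy for verifying pipeline settings after a configuration change. A sketch that pulls the worker count out of the response: the sample JSON is inlined so the snippet runs anywhere; in production, pipe `curl -s localhost:9600/?pretty` into the same extraction, or use jq.

```shell
# Extract the pipeline worker count from the 9600 info endpoint response.
# Sample inlined; in production: info=$(curl -s localhost:9600)
info='{"pipeline":{"workers":8,"batch_size":125,"batch_delay":50}}'
workers=$(echo "$info" | grep -o '"workers" *: *[0-9]*' | grep -o '[0-9]*$')
echo "pipeline workers: $workers"
```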

Tomcat Routine O&M Guide

Production Tomcat is a single node; clustering can be achieved through a load balancer. Version 8.5.60.

Startup user: logmanager

Deployment path: /opt/tomcat

Configuration file: /opt/tomcat/conf/server.xml

Data storage directory: /data/tomcat/data

Log path: /data/tomcat/logs

When Tomcat is running normally, two ports are open: 8009, plus 8080 or 8761.

• 8009: the Tomcat AJP connector port
• 8080: the Tomcat HTTP service port
• 8761: the Spring Eureka registry service port

                                  [hadoop@hostname-1 conf]$ netstat -ln|egrep "(8009|8080)"

                                  tcp        0      0 0.0.0.0:8009            0.0.0.0:*               LISTEN     

                                  tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN

Service start and stop

# Start

sudo systemctl start tomcat.service

# Check status

sudo systemctl status tomcat.service

# Stop

sudo systemctl stop tomcat.service

# Enable start at boot

sudo systemctl enable tomcat.service
