从零到一:SkyWalking 9.x 与 Elasticsearch 8.x 生产环境部署实战
1. 环境准备与组件下载
在开始部署之前,我们需要确保服务器环境满足基本要求。我推荐使用CentOS 7或Ubuntu 20.04 LTS这类稳定的Linux发行版,实测下来这两个系统对SkyWalking和Elasticsearch的兼容性最好。服务器配置方面,生产环境建议至少4核CPU、8GB内存和100GB存储空间,特别是Elasticsearch对内存需求较高。
先安装必要的依赖项:
# CentOS系统 sudo yum install -y java-11-openjdk-devel wget unzip # Ubuntu系统 sudo apt-get update sudo apt-get install -y openjdk-11-jdk wget unzip验证Java环境是否安装成功:
java -version # 应该显示类似:openjdk version "11.0.12"接下来下载所需组件。这里有个坑我踩过:不同版本组合可能存在兼容性问题。经过多次测试,我确认以下组合最稳定:
- SkyWalking 9.5.0
- Elasticsearch 8.5.3
下载地址(建议使用官方镜像):
wget https://archive.apache.org/dist/skywalking/9.5.0/apache-skywalking-apm-9.5.0.tar.gz wget https://archive.apache.org/dist/skywalking/9.5.0/apache-skywalking-java-agent-9.5.0.tgz wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.5.3-linux-x86_64.tar.gz2. Elasticsearch 8.x 部署实战
2.1 安装与基础配置
解压文件到指定目录:
mkdir -p /opt/elasticsearch tar -xzf elasticsearch-8.5.3-linux-x86_64.tar.gz -C /opt/elasticsearch创建专用用户(Elasticsearch不允许用root运行):
useradd -M -s /bin/false elasticsearch chown -R elasticsearch:elasticsearch /opt/elasticsearch修改关键配置(/opt/elasticsearch/elasticsearch-8.5.3/config/elasticsearch.yml):
cluster.name: skywalking-cluster node.name: node-1 path.data: /var/lib/elasticsearch path.logs: /var/log/elasticsearch network.host: 0.0.0.0 discovery.type: single-node xpack.security.enabled: false # 生产环境建议开启,这里为简化先关闭2.2 系统调优与启动
调整系统限制(/etc/security/limits.conf):
elasticsearch soft nofile 65535 elasticsearch hard nofile 65535 elasticsearch soft memlock unlimited elasticsearch hard memlock unlimited配置systemd服务(/usr/lib/systemd/system/elasticsearch.service):
[Unit] Description=Elasticsearch After=network.target [Service] User=elasticsearch Group=elasticsearch Environment="ES_HOME=/opt/elasticsearch/elasticsearch-8.5.3" Environment="ES_PATH_CONF=/opt/elasticsearch/elasticsearch-8.5.3/config" ExecStart=/opt/elasticsearch/elasticsearch-8.5.3/bin/elasticsearch LimitNOFILE=65535 LimitMEMLOCK=infinity [Install] WantedBy=multi-user.target启动服务并验证:
systemctl daemon-reload systemctl start elasticsearch systemctl enable elasticsearch # 验证是否运行成功 curl -X GET "localhost:9200/" # 应该返回包含"you Know, for Search"的JSON3. SkyWalking 9.x 核心组件部署
3.1 OAP服务配置
解压SkyWalking安装包:
mkdir -p /opt/skywalking tar -xzf apache-skywalking-apm-9.5.0.tar.gz -C /opt/skywalking关键配置修改(/opt/skywalking/apache-skywalking-apm-bin/config/application.yml):
storage: selector: ${SW_STORAGE:elasticsearch} elasticsearch: namespace: ${SW_NAMESPACE:"sw_index"} clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200} protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:"http"} user: ${SW_ES_USER:""} password: ${SW_ES_PASSWORD:""} dayStep: ${SW_STORAGE_DAY_STEP:1} # 存储索引按天分割 indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1} # 分片数 indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1} # 副本数3.2 Web UI配置
修改Web应用端口(/opt/skywalking/apache-skywalking-apm-bin/webapp/webapp.yml):
server: port: 12880 collector: path: /graphql ribbon: ReadTimeout: 10000 listOfServers: 127.0.0.1:128003.3 服务启动与管理
创建启动脚本(/opt/skywalking/start_sw.sh):
#!/bin/bash # 启动OAP服务 nohup /opt/skywalking/apache-skywalking-apm-bin/bin/oapService.sh > /dev/null 2>&1 & # 启动Web UI nohup /opt/skywalking/apache-skywalking-apm-bin/bin/webappService.sh > /dev/null 2>&1 &设置开机自启(/etc/systemd/system/skywalking.service):
[Unit] Description=SkyWalking Service After=network.target elasticsearch.service [Service] Type=forking ExecStart=/opt/skywalking/start_sw.sh ExecStop=/bin/kill -TERM $MAINPID [Install] WantedBy=multi-user.target验证服务状态:
# 检查OAP日志 tail -f /opt/skywalking/apache-skywalking-apm-bin/logs/skywalking-oap-server.log # 应该看到"Storage ElasticSearch client is connected" # 访问Web UI curl -I http://localhost:12880 # 应返回HTTP 2004. 探针配置与实战技巧
4.1 Java Agent部署
解压探针包:
tar -xzf apache-skywalking-java-agent-9.5.0.tgz -C /opt/skywalking典型Java应用启动参数示例:
java -javaagent:/opt/skywalking/skywalking-agent/skywalking-agent.jar \ -Dskywalking.agent.service_name=your_service_name \ -Dskywalking.collector.backend_service=localhost:11800 \ -jar your_application.jar4.2 高级配置技巧
探针配置文件(/opt/skywalking/skywalking-agent/config/agent.config)关键参数:
# 服务名称 agent.service_name=${SW_AGENT_NAME:Your_Application} # 采样率(生产环境建议0.1-0.3) agent.sample_n_per_3_secs=${SW_AGENT_SAMPLE:-1} # 忽略特定请求 agent.ignore_suffix=${SW_AGENT_IGNORE_SUFFIX:.jpg,.jpeg,.png,.css,.js,.gif} # 跨进程传播配置 agent.cross_process_propagation=${SW_AGENT_CROSS_PROCESS:true}4.3 常见问题排查
如果遇到数据不显示问题,按以下步骤检查:
- 确认OAP服务日志没有错误
- 检查Elasticsearch索引是否创建成功
curl -X GET "localhost:9200/_cat/indices?v" - 验证探针连接状态
netstat -tulnp | grep 11800 - 调整日志级别(/opt/skywalking/apache-skywalking-apm-bin/config/log4j2.xml)
<Root level="DEBUG"/>
5. 生产环境优化建议
5.1 性能调优参数
Elasticsearch优化(config/jvm.options):
-Xms4g # 初始堆内存 -Xmx4g # 最大堆内存(不超过物理内存50%) -XX:+UseG1GCSkyWalking OAP内存调整(bin/oapService.sh):
export JAVA_OPTS="-Xms2g -Xmx2g -XX:+UseG1GC"5.2 安全配置
启用Elasticsearch安全特性:
xpack.security.enabled: true xpack.security.transport.ssl.enabled: true生成密码并配置SkyWalking:
# 生成密码 /opt/elasticsearch/elasticsearch-8.5.3/bin/elasticsearch-reset-password -u elastic # 修改SkyWalking配置 storage: elasticsearch: user: elastic password: your_strong_password5.3 监控与维护
设置定期清理任务(crontab -e):
# 每天凌晨清理7天前索引 0 0 * * * curl -X DELETE "http://localhost:9200/sw_index-$(date -d '-7 days' +'%Y%m%d')"配置告警规则(/opt/skywalking/apache-skywalking-apm-bin/config/alarm-settings.yml):
rules: service_resp_time_rule: metrics-name: service_resp_time op: ">" threshold: 1000 period: 10 count: 3 silence-period: 5 message: Response time of {name} is more than 1s实际部署中我发现,SkyWalking 9.x与Elasticsearch 8.x的组合在稳定性上有显著提升,但版本匹配非常关键。曾经因为混用8.x和9.x组件导致数据不一致问题,花费了大量时间排查。建议严格按照本文推荐的版本组合部署,可以避免90%的兼容性问题。
