⑧ Spring Cloud in Practice: Adding Actuator Monitoring and Integrating a Grafana Dashboard

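What is Actuator?

The Spring Boot Actuator module provides production-ready features such as health checks, auditing, metrics collection, and HTTP tracing, which help us monitor and manage Spring Boot applications. It collects information from inside the application and exposes it to the outside; all of these features are accessible over HTTP and JMX.

Because it exposes internal information, Actuator can also be integrated with external application monitoring systems (Prometheus, Graphite, DataDog, Influx, Wavefront, New Relic, and others). These systems provide dashboards, graphs, analytics, and alerting, so you can monitor and manage your application through a single, friendly interface.

Actuator uses Micrometer to integrate with these external monitoring systems, so hooking one up takes very little configuration.

Using Actuator

Adding the dependency

Let's create a new project: jlw-actuator

Add the spring-boot-starter-actuator Maven dependency: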

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Start the project and visit /actuator/health. If everything is working, the page returns the following:
```
{
    "status": "UP"
}
```

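Endpoints

Spring Boot provides so-called endpoints that let the outside world access and interact with the application, and you can add your own. Each endpoint can be enabled or disabled, and exposed (made remotely accessible) over HTTP or JMX. Over HTTP, two endpoints are exposed by default: /actuator/health and /actuator/info.

Exposing endpoints

Endpoints may contain sensitive information, so think carefully about when to expose them.

The default exposure settings are as follows (both exclude lists are empty by default; an exclude value of * would hide every endpoint):

management.endpoints.jmx.exposure.exclude =
management.endpoints.jmx.exposure.include = *

management.endpoints.web.exposure.exclude =
management.endpoints.web.exposure.include = info, health

To expose every endpoint except /env and /beans: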

management:
  endpoints:
    web:
      exposure:
        include: "*"
        exclude: "env,beans"
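
The way include and exclude combine can be sketched in plain Java. This is only an illustration of the documented rule that exclude takes precedence over include and that "*" matches every endpoint; the class and method names are hypothetical, and it is not Spring Boot's actual implementation (for instance, it ignores the defaults that apply when include is left empty).

```java
import java.util.Set;

/** Plain-Java sketch of include/exclude resolution; hypothetical names, not Spring Boot code. */
class ExposureSketch {

    /**
     * An endpoint is exposed when it matches the include set and does not
     * match the exclude set. "*" matches every endpoint id, and exclude wins.
     */
    static boolean isExposed(String id, Set<String> include, Set<String> exclude) {
        boolean included = include.contains("*") || include.contains(id);
        boolean excluded = exclude.contains("*") || exclude.contains(id);
        return included && !excluded;
    }

    public static void main(String[] args) {
        // Mirrors the YAML above: include "*", exclude env and beans.
        Set<String> include = Set.of("*");
        Set<String> exclude = Set.of("env", "beans");
        for (String id : new String[] {"health", "metrics", "env", "beans"}) {
            System.out.println(id + " exposed: " + isExposed(id, include, exclude));
        }
    }
}
```

With the settings from the YAML above, health and metrics come out as exposed while env and beans do not.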

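Disabling the other endpoints by default

The following configuration disables all endpoints by default and enables only the /info endpoint: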

management:
  endpoints:
    enabled-by-default: false
  endpoint:
    info:
      enabled: true

Changing the /actuator prefix

Add the following to the configuration file to customize the base-path property:

management:
  endpoints:
    web:
      base-path: /jinglingwang

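The access URL becomes /jinglingwang/health.

Health information

The /health endpoint aggregates the application's health indicators to report its overall health. The show-details property defaults to never, which hides the details; when-authorized and always are the other two options.

management:
  endpoint:
    health:
      show-details: always

With always configured, /jinglingwang/health returns more information, including the status and details of each health component.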
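
To make the aggregation step concrete, here is a minimal plain-Java sketch of how an overall status can be derived from component statuses. It assumes the severity order DOWN, OUT_OF_SERVICE, UP, UNKNOWN (Spring Boot's documented default order); the class name, method name, and component names are hypothetical, and the real aggregator lives inside Actuator.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of health status aggregation; hypothetical names, not Actuator code. */
class HealthAggregationSketch {

    // Assumed severity order, most severe first.
    static final List<String> ORDER = List.of("DOWN", "OUT_OF_SERVICE", "UP", "UNKNOWN");

    /** Returns the most severe known status, or UNKNOWN when there is none. */
    static String aggregate(Map<String, String> componentStatuses) {
        return componentStatuses.values().stream()
                .filter(ORDER::contains)                        // ignore statuses outside the order
                .min(Comparator.comparingInt(ORDER::indexOf))   // lowest index = most severe
                .orElse("UNKNOWN");
    }

    public static void main(String[] args) {
        // Illustrative component statuses, not captured from a running application.
        Map<String, String> components = Map.of("db", "UP", "redis", "DOWN", "diskSpace", "UP");
        System.out.println("overall status: " + aggregate(components));
    }
}
```

A single DOWN component is enough to pull the overall status down, which is why one failing dependency turns the whole /health response red.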

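Disabling auto-configured indicators

Spring Boot auto-configures health indicators with keys such as couchbase, datasource, diskspace, elasticsearch, hazelcast, influxdb, jms, ldap, mail, mongo, neo4j, ping, rabbit, redis, and solr. Each indicator can be enabled or disabled with management.health.key.enabled, where key is the indicator's key.

For example, to disable the Redis health check:

management:
  health:
    redis:
      enabled: false

After a restart, the /jinglingwang/health endpoint no longer includes any Redis information.

The special shutdown endpoint

By default, every endpoint except shutdown is enabled.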
```
management:
  endpoint:
    shutdown:
      enabled: true
```
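Note that the key is endpoint, not endpoints. Use this with great caution in production! The endpoint must also be exposed over HTTP (for example through management.endpoints.web.exposure.include) before it can be called.
Then send a POST request to http://localhost:9000/jinglingwang/shutdown to shut the application down.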
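
As a rough sketch, the POST can be sent with the JDK's built-in HttpClient (Java 11+). The class name and the --send switch are hypothetical additions for this example; the switch is there only so that running the sketch does not stop an application by accident. The URL is the one used above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sketch of calling the shutdown endpoint; hypothetical class, JDK 11+ HttpClient. */
class ShutdownRequestSketch {

    /** Builds a body-less POST to the shutdown endpoint under the given Actuator base URL. */
    static HttpRequest buildShutdownRequest(String baseUrl) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/shutdown"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        HttpRequest request = buildShutdownRequest("http://localhost:9000/jinglingwang");
        System.out.println(request.method() + " " + request.uri());

        // Only send when explicitly asked to, because this stops the target application.
        if (args.length > 0 && args[0].equals("--send")) {
            HttpResponse<String> response = HttpClient.newHttpClient()
                    .send(request, HttpResponse.BodyHandlers.ofString());
            System.out.println(response.statusCode() + " " + response.body());
        }
    }
}
```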

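Custom health indicators

To provide custom health information, register a Spring bean that implements the HealthIndicator interface. Implement the health() method and return a Health response. The response should include a status and can optionally carry additional details to display.

import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component
public class MyHealthIndicator implements HealthIndicator {

    @Override
    public Health health() {
        // int errorCode = checkError(); // custom check logic goes here
        int errorCode = 1; // hard-coded for the demo; 1 means healthy
        if (errorCode != 1) {
            return Health.down().withDetail("Error Code", "https://jingling.im/ not found").build();
        }
        return Health.up()
                .withDetail("code", "200")
                .withDetail("message", "https://jingling.im/")
                .build();
    }

}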

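Securing endpoints

Exposed endpoints have no security checks by default, so sensitive information leaks easily. Exposing every endpoint of a production system to the public internet is a frightening prospect; this is where Spring Security comes in to add the missing checks.

Add the Security dependency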

<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-security</artifactId>
</dependency>

Configure a username and password

spring:
  security:
    user:
      name: jinglingwang
      password: jingling.im
      roles: ENDPOINT_ADMIN

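Endpoint access configuration

import org.springframework.boot.actuate.autoconfigure.security.servlet.EndpointRequest;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.WebSecurityConfigurerAdapter;

@Configuration(proxyBeanMethods = false)
public class ActuatorSecurity extends WebSecurityConfigurerAdapter {

    @Override
    protected void configure(HttpSecurity http) throws Exception {
        // Require the ENDPOINT_ADMIN role for every Actuator endpoint
        http.requestMatcher(EndpointRequest.toAnyEndpoint()).authorizeRequests((requests) ->
                requests.anyRequest().hasRole("ENDPOINT_ADMIN"));
        // Authenticate with HTTP Basic
        http.httpBasic();
    }
}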
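
Once HTTP Basic is enabled, a client has to send an Authorization header with every endpoint request. The following is a minimal sketch of building that header with the JDK alone; the class and method names are hypothetical, the credentials are the ones configured above, and the base URL is an assumption taken from the port and base-path used in this article.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Sketch of an HTTP Basic client request; hypothetical class, JDK 11+ only. */
class BasicAuthSketch {

    /** Encodes "user:password" as Base64 and prefixes it with the Basic scheme. */
    static String basicAuthHeader(String user, String password) {
        String credentials = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    /** Builds a GET request for the health endpoint that carries the Basic credentials. */
    static HttpRequest healthRequest(String baseUrl, String user, String password) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/health"))
                .header("Authorization", basicAuthHeader(user, password))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Assumed base URL; adjust host, port, and base-path to your setup.
        HttpRequest request = healthRequest(
                "http://localhost:8181/jinglingwang", "jinglingwang", "jingling.im");
        System.out.println(request.method() + " " + request.uri());
        System.out.println("Authorization: " + request.headers().firstValue("Authorization").orElse(""));
    }
}
```

Basic credentials are only Base64-encoded, not encrypted, so this is only safe over HTTPS or on a trusted network.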

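Integrating a monitoring dashboard

Monitoring data is visualized with Prometheus and a dashboard tool such as Grafana.

Adding Prometheus to the project

Prometheus is a monitoring platform that collects metrics from monitored targets by scraping HTTP endpoints on them. It is easy to install, use, and extend.

Add the Prometheus dependency

This project uses Spring Boot 2.x, where the Actuator starter already pulls in the io.micrometer packages, so there is no need to add them again. On Spring Boot 1.x you have to add Micrometer yourself (for Spring Boot 1.5, Micrometer's documentation also lists the micrometer-spring-legacy module):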
```
<dependency>
  <groupId>io.micrometer</groupId>
  <artifactId>micrometer-core</artifactId>
  <version>${micrometer.version}</version>
</dependency>
```

Add the micrometer-registry-prometheus package, which provides an Actuator-based endpoint at **/prometheus:

<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
    <scope>runtime</scope>
</dependency>

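Expose all endpoints, or at least the /prometheus endpoint (the metrics endpoint is already enabled by default, so enabling it alone does not make /prometheus reachable):
```
management.endpoints.web.exposure.include=health,info,prometheus
```
Then visit http://localhost:8181/jinglingwang/prometheus to see the metrics data.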
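
The endpoint returns plain text in the Prometheus exposition format: comment lines starting with # (HELP and TYPE) followed by sample lines of the form name{labels} value. As a rough sketch of how that text is structured, the following plain-Java parser reads the sample lines into a map. The class name is hypothetical, the sample data in main is illustrative rather than captured from the project, and the parser assumes there are no trailing timestamps on sample lines.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal parser sketch for the Prometheus text format; hypothetical class, not a full parser. */
class PrometheusTextSketch {

    /** Maps "name{labels}" to its value; comment lines and blank lines are skipped. */
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String rawLine : body.split("\n")) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            // Assumes no trailing timestamp, so the value is the last space-separated token.
            int separator = line.lastIndexOf(' ');
            if (separator < 0) {
                continue; // not a sample line
            }
            samples.put(line.substring(0, separator).trim(),
                    parseValue(line.substring(separator + 1)));
        }
        return samples;
    }

    /** The text format writes infinities as +Inf / -Inf, which Double.parseDouble rejects. */
    static double parseValue(String token) {
        switch (token) {
            case "+Inf": return Double.POSITIVE_INFINITY;
            case "-Inf": return Double.NEGATIVE_INFINITY;
            default: return Double.parseDouble(token);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample data, not output captured from the project.
        String body = String.join("\n",
                "# HELP jvm_threads_live_threads The current number of live threads",
                "# TYPE jvm_threads_live_threads gauge",
                "jvm_threads_live_threads 23.0",
                "jvm_memory_used_bytes{area=\"heap\",id=\"PS Eden Space\",} 1.2E7");
        parse(body).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

In practice Prometheus does this parsing itself when it scrapes the endpoint; the sketch is only meant to show what the page contains.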

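Installing Prometheus

Download Prometheus from https://prometheus.io/download/ (my local environment is Windows 10)
and extract it to a directory of your choice.

Edit prometheus.yml; the scrape_configs section needs to be configured: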

```yaml
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'jinglingwang'
    # metrics_path defaults to '/metrics'; here it is the custom base-path plus /prometheus
    metrics_path: /jinglingwang/prometheus
    # scheme defaults to 'http'.
    static_configs:
    - targets: ['localhost:8181']
```

This project runs on port 8181, so set targets accordingly; the metrics are scraped from `/jinglingwang/prometheus`.

Start Prometheus. On Windows 10, simply run prometheus.exe; the default port is 9090.
Open the console page: http://localhost:9090/classic/targets

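Connecting Grafana

Grafana is an open-source metrics monitoring and visualization tool, commonly used to display time-series data for infrastructure and application analytics. Its dashboards look great, which makes it a real showpiece for any ops team.

Grafana is only a dashboard (alerting was introduced starting with version 4): it visualizes the data held in a data store and does not store any data itself. The time-series data sources it currently supports include Graphite, Prometheus, Elasticsearch, InfluxDB, OpenTSDB, and AWS CloudWatch.

Grafana works much like Kibana: you define aggregation rules from query conditions, display the results on suitable charts, and combine multiple charts into a dashboard. Anyone familiar with Kibana should find it easy to pick up.

Official online demo: http://play.grafana.org/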