Recently I needed to build end-to-end monitoring for some business flows. These flows are composed of several microservices, all written with Java Spring, and we want traffic and performance statistics for every module involved: how many business requests were made in total, how many replies succeeded or failed, how long each step took, and so on. So I looked into how to expose metrics from a Java Spring application, collect them centrally with Prometheus, and present them in Grafana with different dashboards.
First, let's define a simple business flow. Suppose we have two Spring applications: the first exposes an HTTP endpoint for business requests and, when a request arrives, forwards the information it carries to Kafka; the second subscribes to the Kafka topic, receives the business data sent by the first application, and processes it.
Application 1
Create a new project on start.spring.io with the artifact name kafka-sender-example and the dependencies Spring for Apache Kafka, Actuator, and Spring Web. Open the generated project and add a class named RemoteCommandController that implements an HTTP endpoint (note that it also imports com.alibaba.fastjson.JSONObject, which is not a start.spring.io dependency and has to be added to the pom manually). The code is as follows:
- package cn.roygao.kafkasenderexample;
-
- import java.util.Collections;
- import java.util.Map;
- import java.util.UUID;
- import java.util.concurrent.ExecutionException;
- import java.util.concurrent.TimeUnit;
- import java.util.concurrent.TimeoutException;
- import java.util.logging.Logger;
-
- import org.apache.kafka.clients.producer.ProducerRecord;
- import org.springframework.beans.factory.annotation.Autowired;
- import org.springframework.http.ResponseEntity;
- import org.springframework.kafka.core.KafkaTemplate;
- import org.springframework.web.bind.annotation.PostMapping;
- import org.springframework.web.bind.annotation.RequestBody;
- import org.springframework.web.bind.annotation.RestController;
-
- import com.alibaba.fastjson.JSONObject;
-
- @RestController
- public class RemoteCommandController {
- @Autowired
- private KafkaTemplate<Integer, String> template;
-
- private final static Logger LOGGER = Logger.getLogger(RemoteCommandController.class.getName());
-
- @PostMapping("/sendcommand")
- public ResponseEntity<Map<String, String>> sendCommand(@RequestBody JSONObject commandMsg) {
- String requestId = UUID.randomUUID().toString();
- String vin = commandMsg.getString("vin");
- String command = commandMsg.getString("command");
- LOGGER.info("Send command to vehicle:" + vin + ", command:" + command);
- Map<String, String> requestIdObj = Collections.singletonMap("requestId", requestId);
- ProducerRecord<Integer, String> record = new ProducerRecord<>("remotecommand", 1, command);
- try {
- System.out.println(System.currentTimeMillis());
- template.send(record).get(10, TimeUnit.SECONDS);
- }
- catch (ExecutionException e) {
- LOGGER.info("Error");
- LOGGER.info(e.getMessage());
- }
- catch (TimeoutException | InterruptedException e) {
- LOGGER.info("Timeout");
- LOGGER.info(e.getMessage());
- }
- return ResponseEntity.accepted().body(requestIdObj);
- }
- }
The code is straightforward: it exposes a POST /sendcommand endpoint. The caller supplies the vehicle VIN and the command to send; on receiving the request, the controller forwards this business information to a Kafka topic, using KafkaTemplate to send the message. To support this, define a configuration class named KafkaSender:
- package cn.roygao.kafkasenderexample;
-
- import java.util.HashMap;
- import java.util.Map;
-
- import org.apache.kafka.clients.admin.NewTopic;
- import org.apache.kafka.clients.producer.ProducerConfig;
- import org.apache.kafka.common.serialization.IntegerSerializer;
- import org.apache.kafka.common.serialization.StringSerializer;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.kafka.config.TopicBuilder;
- import org.springframework.kafka.core.DefaultKafkaProducerFactory;
- import org.springframework.kafka.core.KafkaTemplate;
- import org.springframework.kafka.core.ProducerFactory;
-
- @Configuration
- public class KafkaSender {
- @Bean
- public NewTopic topic() {
- return TopicBuilder.name("remotecommand")
- .build();
- }
-
- @Bean
- public ProducerFactory<Integer, String> producerFactory() {
- return new DefaultKafkaProducerFactory<>(producerConfigs());
- }
-
- @Bean
- public Map<String, Object> producerConfigs() {
- Map<String, Object> props = new HashMap<>();
- props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
- props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, IntegerSerializer.class);
- props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
- // See https://kafka.apache.org/documentation/#producerconfigs for more properties
- return props;
- }
-
- @Bean
- public KafkaTemplate<Integer, String> kafkaTemplate() {
- return new KafkaTemplate<Integer, String>(producerFactory());
- }
- }
This class defines the Kafka broker address, the message topic, and the key/value serializers.
Run ./mvnw clean package to build and package the application.
Application 2
Create another project on start.spring.io, this time with the artifact name kafka-receiver-example and the dependencies Spring for Apache Kafka and Actuator; the micrometer-registry-prometheus dependency is also required so that Actuator can expose the /actuator/prometheus endpoint used later. Open the generated project and add a class named RemoteCommandHandler that consumes the Kafka messages:
- package cn.roygao.kafkareceiverexample;
-
- import java.util.concurrent.TimeUnit;
- import org.springframework.kafka.annotation.KafkaListener;
- import org.springframework.kafka.listener.adapter.ConsumerRecordMetadata;
- import org.springframework.stereotype.Component;
-
- import io.micrometer.core.instrument.MeterRegistry;
- import io.micrometer.core.instrument.Timer;
-
- @Component
- public class RemoteCommandHandler {
- private Timer timer;
-
- public RemoteCommandHandler(MeterRegistry registry) {
- this.timer = Timer
- .builder("kafka.process.latency")
- .publishPercentiles(0.15, 0.5, 0.95)
- .publishPercentileHistogram()
- .register(registry);
- }
-
- @KafkaListener(id = "myId", topics = "remotecommand")
- public void listen(String in, ConsumerRecordMetadata meta) {
- long latency = System.currentTimeMillis()-meta.timestamp();
- timer.record(latency, TimeUnit.MILLISECONDS);
- }
- }
The class constructor takes a MeterRegistry and builds a Timer, one of the meter types Micrometer provides, which records duration measurements; the Timer is registered with the MeterRegistry. The listen method subscribes to the Kafka topic. For each message it reads the creation timestamp from the record metadata, compares it with the current time to get the latency from message production to consumption, and records that latency with the Timer, which then reports the distribution at the percentiles configured above (0.15, 0.5, 0.95).
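To make the Timer behaviour concrete, here is a small self-contained sketch (my own illustration, not part of the project code; it assumes only micrometer-core on the classpath) showing the two ways a Timer configured like the one above can be fed: an explicit duration, as the listener does with the Kafka timestamp, or a timed block of work.
- import java.util.concurrent.TimeUnit;
-
- import io.micrometer.core.instrument.MeterRegistry;
- import io.micrometer.core.instrument.Timer;
- import io.micrometer.core.instrument.simple.SimpleMeterRegistry;
-
- // Illustration only: exercises the same Timer configuration as RemoteCommandHandler.
- public class TimerUsageSketch {
- public static void main(String[] args) {
- MeterRegistry registry = new SimpleMeterRegistry();
- Timer timer = Timer.builder("kafka.process.latency")
- .publishPercentiles(0.15, 0.5, 0.95)
- .publishPercentileHistogram()
- .register(registry);
-
- // Record a latency that was computed elsewhere, as the Kafka listener does.
- timer.record(42, TimeUnit.MILLISECONDS);
-
- // Or let the Timer time a block of work directly.
- timer.record(() -> {
- // ... message processing would go here ...
- });
-
- // Print the count, total time and percentile values collected so far.
- System.out.println(timer.takeSnapshot());
- }
- }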
This application also needs a Kafka configuration class:
- package cn.roygao.kafkareceiverexample;
-
- import java.util.HashMap;
- import java.util.Map;
-
- import org.apache.kafka.clients.consumer.ConsumerConfig;
- import org.springframework.context.annotation.Bean;
- import org.springframework.context.annotation.Configuration;
- import org.springframework.kafka.annotation.EnableKafka;
- import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
- import org.springframework.kafka.config.KafkaListenerContainerFactory;
- import org.springframework.kafka.core.ConsumerFactory;
- import org.springframework.kafka.core.DefaultKafkaConsumerFactory;
- import org.springframework.kafka.listener.ConcurrentMessageListenerContainer;
-
- @Configuration
- @EnableKafka
- public class KafkaConfig {
- @Bean
- KafkaListenerContainerFactory<ConcurrentMessageListenerContainer<Integer, String>>
- kafkaListenerContainerFactory() {
- ConcurrentKafkaListenerContainerFactory<Integer, String> factory =
- new ConcurrentKafkaListenerContainerFactory<>();
- factory.setConsumerFactory(consumerFactory());
- factory.setConcurrency(3);
- factory.getContainerProperties().setPollTimeout(3000);
- return factory;
- }
-
- @Bean
- public ConsumerFactory<Integer, String> consumerFactory() {
- return new DefaultKafkaConsumerFactory<>(consumerConfigs());
- }
-
- @Bean
- public Map<String, Object> consumerConfigs() {
- Map<String, Object> props = new HashMap<>();
- props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
- props.put("key.deserializer", "org.apache.kafka.common.serialization.IntegerDeserializer");
- props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
- return props;
- }
- }
Add the following settings to application.properties:
- spring.kafka.consumer.auto-offset-reset=earliest
- server.port=7777
- management.endpoints.web.exposure.include=health,info,prometheus
- management.endpoints.enabled-by-default=true
- management.endpoint.health.show-details=always
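With these settings in place (plus the micrometer-registry-prometheus dependency mentioned earlier), the metrics should become scrapeable at http://localhost:7777/actuator/prometheus.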
Then run ./mvnw clean package to build and package the application.
Starting Kafka
I start Kafka with Docker; the compose file is as follows:
- ---
- version: '2'
- services:
- zookeeper:
- image: confluentinc/cp-zookeeper:6.1.0
- hostname: zookeeper
- container_name: zookeeper
- ports:
- - "2181:2181"
- environment:
- ZOOKEEPER_CLIENT_PORT: 2181
- ZOOKEEPER_TICK_TIME: 2000
-
- broker:
- image: confluentinc/cp-server:6.1.0
- hostname: broker
- container_name: broker
- depends_on:
- - zookeeper
- ports:
- - "9092:9092"
- - "9101:9101"
- environment:
- KAFKA_BROKER_ID: 1
- KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
- KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
- KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
- KAFKA_METRIC_REPORTERS: io.confluent.metrics.reporter.ConfluentMetricsReporter
- KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
- KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 0
- KAFKA_CONFLUENT_LICENSE_TOPIC_REPLICATION_FACTOR: 1
- KAFKA_CONFLUENT_BALANCER_TOPIC_REPLICATION_FACTOR: 1
- KAFKA_TRANSACTION_STATE_LOG_MIN_ISR: 1
- KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR: 1
- KAFKA_JMX_PORT: 9101
- KAFKA_JMX_HOSTNAME: localhost
- KAFKA_CONFLUENT_SCHEMA_REGISTRY_URL: http://schema-registry:8081
- CONFLUENT_METRICS_REPORTER_BOOTSTRAP_SERVERS: broker:29092
- CONFLUENT_METRICS_REPORTER_TOPIC_REPLICAS: 1
- CONFLUENT_METRICS_ENABLE: 'true'
- CONFLUENT_SUPPORT_CUSTOMER_ID: 'anonymous'
-
- schema-registry:
- image: confluentinc/cp-schema-registry:6.1.0
- hostname: schema-registry
- container_name: schema-registry
- depends_on:
- - broker
- ports:
- - "8081:8081"
- environment:
- SCHEMA_REGISTRY_HOST_NAME: schema-registry
- SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:29092'
- SCHEMA_REGISTRY_LISTENERS: http://0.0.0.0:8081
-
- connect:
- image: cnfldemos/cp-server-connect-datagen:0.4.0-6.1.0
- hostname: connect
- container_name: connect
- depends_on:
- - broker
- - schema-registry
- ports:
- - "8083:8083"
- environment:
- CONNECT_BOOTSTRAP_SERVERS: 'broker:29092'
- CONNECT_REST_ADVERTISED_HOST_NAME: connect
- CONNECT_REST_PORT: 8083
- CONNECT_GROUP_ID: compose-connect-group
- CONNECT_CONFIG_STORAGE_TOPIC: docker-connect-configs
- CONNECT_CONFIG_STORAGE_REPLICATION_FACTOR: 1
- CONNECT_OFFSET_FLUSH_INTERVAL_MS: 10000
- CONNECT_OFFSET_STORAGE_TOPIC: docker-connect-offsets
- CONNECT_OFFSET_STORAGE_REPLICATION_FACTOR: 1
- CONNECT_STATUS_STORAGE_TOPIC: docker-connect-status
- CONNECT_STATUS_STORAGE_REPLICATION_FACTOR: 1
- CONNECT_KEY_CONVERTER: org.apache.kafka.connect.storage.StringConverter
- CONNECT_VALUE_CONVERTER: io.confluent.connect.avro.AvroConverter
- CONNECT_VALUE_CONVERTER_SCHEMA_REGISTRY_URL: http://schema-registry:8081
- # CLASSPATH required due to CC-2422
- CLASSPATH: /usr/share/java/monitoring-interceptors/monitoring-interceptors-6.1.0.jar
- CONNECT_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
- CONNECT_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
- CONNECT_PLUGIN_PATH: "/usr/share/java,/usr/share/confluent-hub-components"
- CONNECT_LOG4J_LOGGERS: org.apache.zookeeper=ERROR,org.I0Itec.zkclient=ERROR,org.reflections=ERROR
-
- control-center:
- image: confluentinc/cp-enterprise-control-center:6.1.0
- hostname: control-center
- container_name: control-center
- depends_on:
- - broker
- - schema-registry
- - connect
- - ksqldb-server
- ports:
- - "9021:9021"
- environment:
- CONTROL_CENTER_BOOTSTRAP_SERVERS: 'broker:29092'
- CONTROL_CENTER_CONNECT_CLUSTER: 'connect:8083'
- CONTROL_CENTER_KSQL_KSQLDB1_URL: "http://ksqldb-server:8088"
- CONTROL_CENTER_KSQL_KSQLDB1_ADVERTISED_URL: "http://localhost:8088"
- CONTROL_CENTER_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
- CONTROL_CENTER_REPLICATION_FACTOR: 1
- CONTROL_CENTER_INTERNAL_TOPICS_PARTITIONS: 1
- CONTROL_CENTER_MONITORING_INTERCEPTOR_TOPIC_PARTITIONS: 1
- CONFLUENT_METRICS_TOPIC_REPLICATION: 1
- PORT: 9021
-
- ksqldb-server:
- image: confluentinc/cp-ksqldb-server:6.1.0
- hostname: ksqldb-server
- container_name: ksqldb-server
- depends_on:
- - broker
- - connect
- ports:
- - "8088:8088"
- environment:
- KSQL_CONFIG_DIR: "/etc/ksql"
- KSQL_BOOTSTRAP_SERVERS: "broker:29092"
- KSQL_HOST_NAME: ksqldb-server
- KSQL_LISTENERS: "http://0.0.0.0:8088"
- KSQL_CACHE_MAX_BYTES_BUFFERING: 0
- KSQL_KSQL_SCHEMA_REGISTRY_URL: "http://schema-registry:8081"
- KSQL_PRODUCER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringProducerInterceptor"
- KSQL_CONSUMER_INTERCEPTOR_CLASSES: "io.confluent.monitoring.clients.interceptor.MonitoringConsumerInterceptor"
- KSQL_KSQL_CONNECT_URL: "http://connect:8083"
- KSQL_KSQL_LOGGING_PROCESSING_TOPIC_REPLICATION_FACTOR: 1
- KSQL_KSQL_LOGGING_PROCESSING_TOPIC_AUTO_CREATE: 'true'
- KSQL_KSQL_LOGGING_PROCESSING_STREAM_AUTO_CREATE: 'true'
-
- ksqldb-cli:
- image: confluentinc/cp-ksqldb-cli:6.1.0
- container_name: ksqldb-cli
- depends_on:
- - broker
- - connect
- - ksqldb-server
- entrypoint: /bin/sh
- tty: true
-
- ksql-datagen:
- image: confluentinc/ksqldb-examples:6.1.0
- hostname: ksql-datagen
- container_name: ksql-datagen
- depends_on:
- - ksqldb-server
- - broker
- - schema-registry
- - connect
- command: "bash -c 'echo Waiting for Kafka to be ready... && \
- cub kafka-ready -b broker:29092 1 40 && \
- echo Waiting for Confluent Schema Registry to be ready... && \
- cub sr-ready schema-registry 8081 40 && \
- echo Waiting a few seconds for topic creation to finish... && \
- sleep 11 && \
- tail -f /dev/null'"
- environment:
- KSQL_CONFIG_DIR: "/etc/ksql"
- STREAMS_BOOTSTRAP_SERVERS: broker:29092
- STREAMS_SCHEMA_REGISTRY_HOST: schema-registry
- STREAMS_SCHEMA_REGISTRY_PORT: 8081
-
- rest-proxy:
- image: confluentinc/cp-kafka-rest:6.1.0
- depends_on:
- - broker
- - schema-registry
- ports:
- - 8082:8082
- hostname: rest-proxy
- container_name: rest-proxy
- environment:
- KAFKA_REST_HOST_NAME: rest-proxy
- KAFKA_REST_BOOTSTRAP_SERVERS: 'broker:29092'
- KAFKA_REST_LISTENERS: "http://0.0.0.0:8082"
- KAFKA_REST_SCHEMA_REGISTRY_URL: 'http://schema-registry:8081'
Run nohup docker compose up > ./kafka.log 2>&1 & to start the stack. Open localhost:9021 in a browser to inspect Kafka in the Control Center console.
Start application 1 and application 2, then call the POST http://localhost:8080/sendcommand endpoint to send a business request, for example:
- curl --location --request POST 'http://localhost:8080/sendcommand' \
- --header 'Content-Type: application/json' \
- --data-raw '{
- "vin": "ABC123",
- "command": "engine-start"
- }'
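Based on the controller code above, the endpoint replies with HTTP 202 Accepted and a small JSON body containing the generated requestId.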
In the Kafka Control Center you can see a remotecommand topic, with one message produced and one consumed.
Starting Prometheus and Grafana
These are also started with docker compose; the compose file is:
- services:
- prometheus:
- image: prom/prometheus-linux-amd64
- #network_mode: host
- container_name: prometheus
- restart: unless-stopped
- volumes:
- - ./config:/etc/prometheus/
- command:
- - '--config.file=/etc/prometheus/prometheus.yaml'
- ports:
- - 9090:9090
- grafana:
- image: grafana/grafana
- user: '472'
- #network_mode: host
- container_name: grafana
- restart: unless-stopped
- links:
- - prometheus:prometheus
- volumes:
- - ./data/grafana:/var/lib/grafana
- environment:
- - GF_SECURITY_ADMIN_PASSWORD=admin
- ports:
- - 3000:3000
- depends_on:
- - prometheus
Create a config directory next to this compose file to hold the Prometheus configuration:
- scrape_configs:
- - job_name: 'Spring Boot Application input'
- metrics_path: '/actuator/prometheus'
- scrape_interval: 2s
- static_configs:
- - targets: ['172.17.0.1:7777']
- labels:
- application: 'My Spring Boot Application'
The targets entry is the address exposed by application 2 (172.17.0.1 is the default Docker bridge gateway, through which the Prometheus container reaches the application listening on port 7777 on the host), and metrics_path is the path the metrics are scraped from.
Also create a data/grafana directory next to the compose file to mount as Grafana's data directory. Note that you need to chmod 777 this directory, otherwise Grafana reports a permission error.
Run nohup docker compose up > ./prometheus.log 2>&1 & to start them.
Open localhost:9090 to reach the Prometheus UI. Searching for kafka shows the kafka_process_latency metrics reported by application 2, broken out at the 0.15, 0.5, and 0.95 percentiles we configured.
Open localhost:3000 to reach Grafana. Add a data source pointing at the Prometheus container's address and click Save & Test. Then create a new dashboard and add a panel that charts the kafka_process_latency metric.
[To be continued] Still to do: add a Counter metric for the HTTP endpoint calls, and define more Grafana dashboards, including metrics for the other services.
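As a rough preview of that follow-up, request counting on the HTTP side could be done with Micrometer Counters along these lines. This is only a hedged sketch with hypothetical metric and class names, not the final implementation; the counters would be injected into RemoteCommandController and incremented in the success and error branches of sendCommand.
- import io.micrometer.core.instrument.Counter;
- import io.micrometer.core.instrument.MeterRegistry;
- import org.springframework.stereotype.Component;
-
- // Sketch only: counts successful and failed command requests.
- @Component
- public class CommandMetrics {
- private final Counter successCounter;
- private final Counter failureCounter;
-
- public CommandMetrics(MeterRegistry registry) {
- this.successCounter = Counter.builder("remote.command.requests")
- .tag("outcome", "success")
- .register(registry);
- this.failureCounter = Counter.builder("remote.command.requests")
- .tag("outcome", "failure")
- .register(registry);
- }
-
- public void recordSuccess() { successCounter.increment(); }
- public void recordFailure() { failureCounter.increment(); }
- }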