Using Kibana Alerts to Proactively Notify You of Anomalies

What you will learn

What to prepare before using Kibana Alerts.
Which Alert settings Elastic Observability provides through its integration with Kibana Alerts.

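Using Kibana Alerts

Before using Kibana Alerts, the following items need attention and preparation:

Prepare and configure an encryption key

Kibana encrypts sensitive data and Alerting Rules when it stores them, so an encryption key has to be prepared first.

Set the key in the xpack.encryptedSavedObjects.encryptionKey setting in kibana.yml. The following tool and command help generate keys: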

bin/kibana-encryption-keys generate
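To make the shape of the resulting kibana.yml entry concrete, here is a minimal sketch that generates a random key with Python's secrets module and formats the setting line. It is only a stand-in for the kibana-encryption-keys tool, not a reproduction of it. The helper name make_encryption_key_line is hypothetical, and the 32-character minimum is an assumption based on Kibana's documented requirement for this setting.

```python
import secrets

MIN_KEY_LENGTH = 32  # assumption: Kibana expects at least 32 characters here


def make_encryption_key_line(num_bytes: int = 16) -> str:
    """Return a kibana.yml line holding a random hex key (hypothetical helper)."""
    key = secrets.token_hex(num_bytes)  # two hex characters per random byte
    if len(key) < MIN_KEY_LENGTH:
        raise ValueError("key must be at least 32 characters long")
    # Quote the value so YAML always reads it as a string.
    return f'xpack.encryptedSavedObjects.encryptionKey: "{key}"'


if __name__ == "__main__":
    print(make_encryption_key_line())
```

The printed line can be pasted into kibana.yml. Every Kibana instance that shares the same saved objects needs the same key.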

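Configure a publicly accessible Base URL

Notifications that an Alert sends to external systems can contain links that lead back to Kibana. For those links to work, set the Base URL that is valid when Kibana is accessed from outside, using server.publicBaseUrl in kibana.yml.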
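A value for server.publicBaseUrl is easy to get subtly wrong, so the following sketch checks a candidate URL before it goes into kibana.yml. The rules it checks (protocol and hostname present, no trailing slash) are my reading of the Kibana documentation for this setting. The function name and the example hostname are hypothetical.

```python
from typing import List
from urllib.parse import urlsplit


def check_public_base_url(url: str) -> List[str]:
    """Return the problems found in a candidate server.publicBaseUrl value."""
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        problems.append("must start with http:// or https://")
    if not parts.hostname:
        problems.append("must include a hostname")
    if url.endswith("/"):
        problems.append("must not end with a slash")
    return problems


if __name__ == "__main__":
    # kibana.example.com is a placeholder hostname.
    for candidate in ("https://kibana.example.com:5601", "https://kibana.example.com/"):
        print(candidate, check_public_base_url(candidate) or "looks fine")
```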

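Use secure TLS connections

If the Elastic Security features are enabled, TLS connections must be configured and enabled. (See the official documentation - Setup basic security for the Elastic Stack plus secured HTTPS traffic.)

So that its background tasks can access Elasticsearch securely, Kibana alerting authenticates with API keys, and using API keys requires TLS to be enabled.

Setting up TLS between Elasticsearch and Kibana involves the following steps:

Use elasticsearch-certutil with the http parameter to create the TLS certificate.

If your environment has no CA (Certificate Authority), or you have not created one before, first create a CA with bin/elasticsearch-certutil and the ca parameter.

Once the certificate for http has been created, the resulting archive includes the kibana/elasticsearch-ca.pem file and the http.p12 keystore used in the following steps.

Copy the kibana/elasticsearch-ca.pem file to Kibana's config directory, set elasticsearch.ssl.certificateAuthorities in kibana.yml to point to it, and change the Elasticsearch protocol to HTTPS: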

elasticsearch.ssl.certificateAuthorities: $KBN_PATH_CONF/elasticsearch-ca.pem
elasticsearch.hosts: ["https://localhost:9200"]

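Also make sure that TLS is enabled and correctly configured on the Elasticsearch side in elasticsearch.yml, and copy the previously generated http.p12 to the Elasticsearch config directory.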

xpack.security.http.ssl.enabled: true
xpack.security.http.ssl.keystore.path: http.p12

Restart Elasticsearch and Kibana.

Create a Connector

Kibana Alerts supports many kinds of Connectors:

Send an email
Write to an Elasticsearch index
Create a Jira incident
PagerDuty
Slack
Webhook
IBM Resilient
Microsoft Teams
Write to the Kibana server log
Create a ServiceNow incident
Create a Swimlane incident

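Connectors are configured under Kibana > Stack Management > Rules and Connectors > Connector. Some security-related ones may need a keystore to hold their secrets. For the detailed settings, see the official documentation - Kibana Action Types.

Once created, these Connectors are the endpoints through which an Alert finally sends its notifications.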
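Connectors can also be created programmatically instead of through the UI. The sketch below only builds a description of such a request and does not send anything. It assumes the POST /api/actions/connector endpoint and the .server-log connector type ID found in recent Kibana versions (older versions used a different path), and it leaves out authentication entirely. The function name and the example URL are hypothetical.

```python
import json


def build_server_log_connector_request(kibana_url: str, name: str) -> dict:
    """Describe the HTTP request that would create a server log connector."""
    return {
        "method": "POST",
        # Assumed endpoint; check the API docs for your Kibana version.
        "url": f"{kibana_url.rstrip('/')}/api/actions/connector",
        "headers": {
            "kbn-xsrf": "true",  # header Kibana expects on write requests
            "Content-Type": "application/json",
        },
        "body": json.dumps({"name": name, "connector_type_id": ".server-log"}),
    }


if __name__ == "__main__":
    # kibana.example.com is a placeholder hostname.
    request = build_server_log_connector_request(
        "https://kibana.example.com:5601", "my-server-log"
    )
    print(request["method"], request["url"])
    print(request["body"])
```

The returned dictionary can be handed to whichever HTTP client you prefer, together with credentials or an API key.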

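Elastic Observability's Integration with and Support for Alerts

Elastic Observability integrates with Alerts: within the Observability UI, each app offers a quicker way to create Alert Rules that is tailored to the characteristics of that service. The Alert settings are described below.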

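Logs

For logs, you can quickly set up Alerts based on a log threshold. The key points are:

Within a time window, an alert is sent when the number of logs matching a specific condition, or the ratio between the numbers of logs matching two conditions, reaches a threshold.
The calculation can also be grouped by the value of a specific field (group by), so the rule above is evaluated separately for each group, for example per machine or per service.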
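The two points above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Kibana evaluates the rule: the logs are assumed to be already limited to the time window, "reaches" is taken to mean greater than or equal, and the function names, field names and sample data are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List

LogEntry = Dict[str, str]
Condition = Callable[[LogEntry], bool]

# Hypothetical logs that already fall inside the rule's time window.
SAMPLE_LOGS: List[LogEntry] = [
    {"host": "web-1", "level": "error"},
    {"host": "web-1", "level": "error"},
    {"host": "web-1", "level": "info"},
    {"host": "web-2", "level": "error"},
    {"host": "web-2", "level": "info"},
    {"host": "web-2", "level": "info"},
    {"host": "web-2", "level": "info"},
]


def is_error(log: LogEntry) -> bool:
    return log["level"] == "error"


def is_info(log: LogEntry) -> bool:
    return log["level"] == "info"


def groups_over_count_threshold(
    logs: Iterable[LogEntry], matches: Condition, group_by: str, threshold: int
) -> List[str]:
    """Return the groups whose count of matching logs reaches the threshold."""
    counts = Counter(log[group_by] for log in logs if matches(log))
    return sorted(group for group, n in counts.items() if n >= threshold)


def groups_over_ratio_threshold(
    logs: Iterable[LogEntry],
    numerator: Condition,
    denominator: Condition,
    group_by: str,
    threshold: float,
) -> List[str]:
    """Return the groups whose numerator/denominator log ratio reaches the threshold."""
    logs = list(logs)  # the logs are scanned twice
    top = Counter(log[group_by] for log in logs if numerator(log))
    bottom = Counter(log[group_by] for log in logs if denominator(log))
    # Groups with no denominator logs never appear in `bottom`, so they are skipped.
    return sorted(g for g, n in bottom.items() if top[g] / n >= threshold)


if __name__ == "__main__":
    print(groups_over_count_threshold(SAMPLE_LOGS, is_error, "host", 2))
    print(groups_over_ratio_threshold(SAMPLE_LOGS, is_error, is_info, "host", 1.0))
```

Grouping by host means web-1 and web-2 are judged separately, so one noisy machine cannot hide behind a quiet one.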

For the detailed settings, see the official documentation - Observability - Create a logs threshold rule.

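Metrics

In Metrics, the Inventory and Metrics Explorer pages each provide their own kind of Alert settings.

Inventory - Infrastructure Alerts

In Inventory, the focus is infrastructure, whether that is hosts, Docker, Kubernetes, or AWS cloud hosts and services.

So for Inventory, the main points of the Alert settings are:

An Alert is sent when a specified host or service reaches a specified threshold on any metric condition within a recent time period (for example CPU or memory usage, network traffic, log rate, or even the average of any field).
Besides the metric conditions above, an Alert can also be sent when no logs at all are produced within a recent time period.
In addition to the Alert level, you can also define conditions for a Warning level.
The main difference from the Alert settings in Metrics is that Inventory scopes the rule to specific hosts or services.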
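The Alert and Warning levels described above can be illustrated with a small Python sketch. It is not Kibana's implementation: it assumes the metric is averaged over the recent period, that a level applies once the average is greater than or equal to its threshold, and that an empty list stands for a host that reported nothing. The function name, host names and numbers are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def classify_hosts(
    samples: Dict[str, List[float]], warning: float, alert: float
) -> Dict[str, str]:
    """Map each host to "alert", "warning", "ok" or "no data"."""
    result = {}
    for host, values in samples.items():
        if not values:
            # Nothing was reported for this host in the period.
            result[host] = "no data"
            continue
        average = mean(values)
        if average >= alert:
            result[host] = "alert"
        elif average >= warning:
            result[host] = "warning"
        else:
            result[host] = "ok"
    return result


if __name__ == "__main__":
    # Hypothetical CPU usage samples (0.0 to 1.0) from the recent period.
    cpu_usage = {
        "web-1": [0.95, 0.90],
        "web-2": [0.75, 0.80],
        "web-3": [0.20],
        "web-4": [],
    }
    print(classify_hosts(cpu_usage, warning=0.7, alert=0.9))
```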