---
name: bms-log-test-query
description: >
  Query BMS (bms-sit) application logs from Elasticsearch via Kibana console proxy.
  Use when the user asks to check BMS logs, search BMS errors, or look up recent log entries.
  All queries go through Kibana at http://172.17.12.18:8000 (the ES direct port is NOT accessible).
metadata:
  author: local
  version: 3.0.0
---
# BMS Log Query Skill
> **Scope: ONLY `bms-sit` data view → `bms-test*` and `pms*` indices.**
## Connection Details (DO NOT re-verify — confirmed working)
- **Auth**: Read from `~/.env` (home directory):
  - `BMS_LOG_TEST_URL` = Kibana proxy URL
  - `BMS_LOG_TEST_USERNAME` = elastic
  - `BMS_LOG_TEST_PASSWORD` = (stored in .env)
- **ES Version**: 8.6.1
- **ES direct port**: NOT accessible. All queries go through Kibana console proxy.
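
As a minimal sketch of reading these settings, the snippet below parses plain `KEY=VALUE` lines from `~/.env`. It assumes the file uses that simple format (no `export` prefixes, no multi-line values); the helper names `parse_env` and `load_bms_settings` are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_bms_settings(env_path: Optional[Path] = None) -> Dict[str, str]:
    """Read the three BMS_LOG_TEST_* variables from ~/.env (assumed format)."""
    path = env_path if env_path is not None else Path.home() / ".env"
    env = parse_env(path.read_text(encoding="utf-8"))
    return {
        "url": env["BMS_LOG_TEST_URL"],
        "username": env["BMS_LOG_TEST_USERNAME"],
        "password": env["BMS_LOG_TEST_PASSWORD"],
    }
```

The credentials stay in `~/.env` and are only held in memory, which keeps them out of this file.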
## Data View Mapping
| Kibana Data View | ES Index Pattern |
|-----------------|------------------|
| `bms-sit` | `bms-test*, pms*` |
## Kibana Console Proxy Format
```
POST http://172.17.12.18:8000/api/console/proxy?path=<URL_ENCODED_ES_PATH>&method=<HTTP_METHOD>
```
Headers: `kbn-xsrf: true`, `Content-Type: application/json`
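
The proxy format above can be sketched in Python with only the standard library. The function names `build_proxy_url` and `build_request` are hypothetical, and sending the credentials as an HTTP Basic `Authorization` header is an assumption about how this Kibana instance authenticates.

```python
import base64
import json
import urllib.parse
import urllib.request
from typing import Optional


def build_proxy_url(base_url: str, es_path: str, method: str = "GET") -> str:
    """URL-encode the ES path and HTTP method into the console proxy URL."""
    query = urllib.parse.urlencode({"path": es_path, "method": method})
    return f"{base_url.rstrip('/')}/api/console/proxy?{query}"


def build_request(base_url: str, es_path: str, username: str, password: str,
                  body: Optional[dict] = None,
                  method: str = "GET") -> urllib.request.Request:
    """Build the POST to Kibana; `method` is the verb forwarded to ES."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    headers = {
        "kbn-xsrf": "true",
        "Content-Type": "application/json",
        # Assumption: Kibana accepts HTTP Basic auth for this user.
        "Authorization": f"Basic {token}",
    }
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        build_proxy_url(base_url, es_path, method),
        data=data, headers=headers, method="POST",
    )
```

A request built this way can be sent with `urllib.request.urlopen(req)` and the response decoded with `json.load`.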
## Index Pattern
- `bms-test-YYYY-MM-DD` — daily rolling indices, ~2,000,000 docs/day
- `pms-test-YYYY-MM-DD` — PMS test logs, ~59,000 docs/day
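
To fill in the `<DATE>` placeholder used in the queries below, a small helper can derive the daily index name from a date. This is a sketch; the name `daily_index` is hypothetical, and which timezone decides the daily rollover is not stated here, so the date passed in is an assumption on the caller's side.

```python
from datetime import date


def daily_index(day: date, prefix: str = "bms-test") -> str:
    """Return the daily index name, e.g. bms-test-YYYY-MM-DD."""
    return f"{prefix}-{day.isoformat()}"


# Today's BMS index, plus a PMS index for an example (hypothetical) date.
todays_bms_index = daily_index(date.today())
example_pms_index = daily_index(date(2024, 3, 5), prefix="pms-test")
```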
## Query Patterns
### Latest N logs
```json
POST /api/console/proxy?path=/bms-test-<DATE>/_search&method=GET
{
  "sort": [{"@timestamp": "desc"}],
  "size": 10
}
```
### Search by keyword
```json
POST /api/console/proxy?path=/bms-test-<DATE>/_search&method=GET
{
  "query": {
    "multi_match": {
      "query": "<keyword>",
      "fields": ["message", "error.message", "original_message"]
    }
  },
  "sort": [{"@timestamp": "desc"}],
  "size": 20
}
```
### Search errors in time range
```json
POST /api/console/proxy?path=/bms-test-<DATE>/_search&method=GET
{
  "query": {
    "bool": {
      "must": [
        { "range": { "@timestamp": { "gte": "now-1h", "lte": "now" } } },
        { "match_phrase": { "message": "ERROR" } }
      ]
    }
  },
  "size": 20,
  "sort": [{"@timestamp": "desc"}]
}
```
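
If the time window or result size needs to vary, the same body can be built programmatically. The sketch below mirrors the query above; the function name `errors_query` and its parameters are hypothetical. A window that crosses midnight may span two daily indices, which this sketch does not handle.

```python
def errors_query(minutes: int = 60, size: int = 20) -> dict:
    """Build the 'errors in time range' search body for the last N minutes."""
    return {
        "query": {
            "bool": {
                "must": [
                    # ES date math: now-<N>m means N minutes before now.
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m", "lte": "now"}}},
                    {"match_phrase": {"message": "ERROR"}},
                ]
            }
        },
        "size": size,
        "sort": [{"@timestamp": "desc"}],
    }
```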
### Count docs
```
POST /api/console/proxy?path=/bms-test-<DATE>/_count&method=GET
```
### Get mapping (available fields)
```
POST /api/console/proxy?path=/bms-test-<DATE>/_mapping&method=GET
```
## Log Fields
| Field | Description |
|-------|-------------|
| `@timestamp` | ES timestamp (ISO 8601) |
| `timestamp` | Original time string |
| `message` | Log body (Chinese/English) |
| `original_message` | Original unformatted message |
| `level` | Log level (INFO, WARN, ERROR) |
| `app_name` | Application name (e.g. bms-web) |
| `class` | Java class name |
| `thread` | Thread name (e.g. `http-nio-8081-exec-59`) |
| `traceId` | SkyWalking trace ID |
| `parentTraceId` | Parent trace ID |
| `stack_trace` | Exception stack trace (empty when there is no exception) |
| `host_ip` | Host IP |
| `log_origin` | Log source identifier |
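## Troubleshooting Guide: Call-Chain Issues
When troubleshooting a business call chain, combine these three core dimensions:
1. **message + timestamp**: pinpoint the specific operation and when it happened, to narrow the scope quickly
2. **traceId**: the SkyWalking distributed trace ID, which can follow the whole call chain: frontend → gateway → service A → service B → DB
   - In most business scenarios the traceId carries through the whole chain
   - **Exceptions**: xxljob scheduled jobs and dubbo service-to-service calls may lose the traceId
3. **thread**: the thread name on a single machine, which helps locate the executing thread
   - The test environment usually runs a single instance, so `thread` alone identifies the thread
   - **Watch for distribution in production**: the same thread name can appear on different machines, so use it together with `host_ip`

**Recommended workflow:**
- Known symptom → use a message keyword + time range to find the first related log → extract its traceId → query the full chain by traceId → combine thread + host_ip to locate the specific node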
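
The workflow above can be sketched as two small helpers: one pulls the `traceId` out of a search response, the other builds the follow-up query for the full chain. The function names are hypothetical. The code assumes the standard Elasticsearch response shape (`hits.hits[]._source`) and uses `match_phrase` on `traceId` because the field's mapping type is not confirmed here.

```python
from typing import Optional


def first_trace_id(response: dict) -> Optional[str]:
    """Return the traceId of the first hit that has one, else None."""
    for hit in response.get("hits", {}).get("hits", []):
        trace_id = hit.get("_source", {}).get("traceId")
        if trace_id:
            return trace_id
    return None


def trace_query(trace_id: str, size: int = 200) -> dict:
    """Build a search body returning one trace's logs in chronological order."""
    return {
        "query": {"match_phrase": {"traceId": trace_id}},
        # Ascending order so the chain reads from first call to last.
        "sort": [{"@timestamp": "asc"}],
        "size": size,
    }
```

Each returned hit carries `thread` and `host_ip` in its `_source`, which covers the last step of locating the specific node.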
## Rules
1. **Never re-probe ES connectivity** — Kibana proxy is the only working method
2. **Never try ports 9200/9201/5601** — not accessible
3. **Never store credentials** in this file
4. **When the user says "查 bms-sit" ("check bms-sit") → query `bms-test-*` indices**