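Recently I have been exploring practical, deployable approaches to an intelligent operations platform. Frankly, it is hard, because many of the details are still difficult to put into practice with today's technology. So we might as well take an indirect route: rather than building a platform that has to account for every complex scenario, first implement and ship a single functional module. That is why my current research direction is an automated operations agent!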
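Alert rules for kubelet, K8s components, application SLOs, and so on
namespace, pod, container, severity
LogQL extracts error logs (e.g. Exception, OOMKilled)

| Intelligent alert analysis | |
| Automated remediation | |
| Predictive maintenance | |
| Natural-language interaction | |
| Knowledge base management | |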
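The LogQL error-log extraction mentioned above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the helper name `build_error_logql` and its parameters are hypothetical, and it assumes the log streams carry `namespace`, `pod`, and `container` labels and that the patterns are plain words that need no escaping.

```python
from typing import Optional, Sequence


def build_error_logql(
    namespace: str,
    pod: Optional[str] = None,
    container: Optional[str] = None,
    patterns: Sequence[str] = ("Exception", "OOMKilled"),
) -> str:
    """Build a LogQL query that keeps only lines matching any error pattern.

    Hypothetical helper: label names are assumed, and label values and
    patterns are assumed to be plain words (no quotes or backslashes).
    """
    labels = {"namespace": namespace, "pod": pod, "container": container}
    # Stream selector: only include the labels the caller actually supplied.
    matchers = ", ".join(f'{k}="{v}"' for k, v in labels.items() if v is not None)
    # Line filter: `|~` keeps lines matching the regular expression.
    regex = "|".join(patterns)
    return f'{{{matchers}}} |~ "{regex}"'


if __name__ == "__main__":
    print(build_error_logql("prod", pod="api-0"))
```

The resulting string could then be sent to a Loki query endpoint by whatever client the agent uses; that part is left out here because it depends on the deployment.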
It is recommended to add 3 nodes before 10:00 tomorrow.

Tool integration:
    # Dify tool definition example
    tools = [
        {
            "name": "query_prometheus",
            "description": "Query Prometheus metrics",
            "parameters": {
                "query": {"type": "string", "description": "PromQL expression"},
                "time_range": {"type": "string", "description": "e.g. 1h"},
            },
        },
        {
            "name": "execute_k8s_action",
            "description": "Execute a K8s operation",
            "parameters": {
                "action": {"type": "string", "enum": ["restart_pod", "scale_deployment"]},
                "target": {"type": "string", "description": "Resource name"},
            },
        },
    ]
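Once the agent emits a tool call, something has to check it against the definitions and route it to real code. The sketch below shows one way that could look. It is an assumption-laden illustration: `TOOL_SCHEMAS`, `validate_call`, and `dispatch` are hypothetical names, the choice of which parameters are required is assumed (the definitions above do not say), and the handlers are stubs rather than real Prometheus or Kubernetes clients.

```python
from typing import Any, Callable, Dict

# Assumed constraints derived from the tool definitions above.
# Which parameters are required is an assumption, not stated in the source.
TOOL_SCHEMAS: Dict[str, Dict[str, Any]] = {
    "query_prometheus": {"required": ["query"], "enums": {}},
    "execute_k8s_action": {
        "required": ["action", "target"],
        "enums": {"action": ["restart_pod", "scale_deployment"]},
    },
}


def validate_call(name: str, args: Dict[str, Any]) -> None:
    """Raise ValueError if the tool is unknown or its arguments are invalid."""
    if name not in TOOL_SCHEMAS:
        raise ValueError(f"unknown tool: {name}")
    schema = TOOL_SCHEMAS[name]
    missing = [p for p in schema["required"] if p not in args]
    if missing:
        raise ValueError(f"missing parameters for {name}: {missing}")
    for param, allowed in schema["enums"].items():
        if param in args and args[param] not in allowed:
            raise ValueError(f"{param}={args[param]!r} is not one of {allowed}")


def dispatch(
    name: str, args: Dict[str, Any], handlers: Dict[str, Callable[..., Any]]
) -> Any:
    """Validate a tool call, then pass its arguments to the matching handler."""
    validate_call(name, args)
    return handlers[name](**args)


if __name__ == "__main__":
    # Stub handlers: they only describe the action instead of performing it.
    stub_handlers = {
        "query_prometheus": lambda query, time_range="1h": f"would query {query} over {time_range}",
        "execute_k8s_action": lambda action, target: f"would run {action} on {target}",
    }
    print(dispatch("execute_k8s_action", {"action": "restart_pod", "target": "api-0"}, stub_handlers))
```

Rejecting any action outside the `enum` before a handler runs keeps the agent limited to the two operations the definition allows, which matters when the caller is a language model.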