Research field: Software Development
Venue: Discover Data
Production systems are often complex and distributed. Debugging them is time-consuming and nuanced, requiring software engineers (SEs) to understand vast codebases and the context of their organization’s engineering setup. This places a heavy cognitive load on SEs, particularly when they are on call and expected to fix critical, time-sensitive problems. The result is often sub-optimal human performance, which decreases productivity and can harm system reliability through prolonged downtime. In this paper, we introduce DebugMate (Observe.AI. Debugmate. 2024. https://github.com/Observeai-Research/DebugMate), an AI agent that uniquely integrates an organization’s internal context with external knowledge sources. DebugMate connects to the organization’s key system resources, such as documentation, the codebase, and a knowledge base of historical incidents, along with online developer platforms (e.g., StackOverflow, GitHub), and helps SEs respond faster by generating multiple hypotheses about the root cause of a production issue. DebugMate employs Retrieval-Augmented Generation (RAG), ReAct, Tree-of-Thought, and long-term memory to provide grounded debugging hypotheses via structured and systematic self-planning. In addition, it uses graphical representations to build context not only of the organization’s code but also of imported code modules, so that it can identify and resolve complex issues caused by unfamiliar external frameworks (e.g., Spring Boot). Our proposed approach increases accuracy in identifying an issue by 20% compared to the baseline. On our historical reliability incidents, DebugMate achieves a 77% success rate in identifying root causes and suggesting fixes.
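
The abstract does not say how the graphical representation of internal and imported code is built. One plausible reading is a dependency graph that links an organization's modules to the external modules they pull in. The sketch below illustrates that idea only and is not DebugMate's implementation: it uses Python's standard `ast` module to extract imports, and the `sources` dictionary, module names, and function names are all hypothetical. Any module absent from `sources` is treated as external, which also counts standard-library modules.

```python
import ast
from collections import deque


def build_import_graph(sources):
    """Map each internal module name to the set of top-level modules it imports.

    `sources` is a hypothetical dict of {module_name: source_code}.
    """
    graph = {}
    for name, code in sources.items():
        deps = set()
        for node in ast.walk(ast.parse(code)):
            if isinstance(node, ast.Import):
                deps.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                # Absolute "from x import y"; relative imports are skipped.
                deps.add(node.module.split(".")[0])
        graph[name] = deps
    return graph


def external_dependencies(graph, start):
    """Breadth-first walk from `start`; return reachable modules that are not internal."""
    seen, queue, external = {start}, deque([start]), set()
    while queue:
        current = queue.popleft()
        for dep in graph.get(current, ()):
            if dep in seen:
                continue
            seen.add(dep)
            if dep in graph:
                queue.append(dep)   # internal module: keep walking
            else:
                external.add(dep)   # not in the codebase: treat as external
    return sorted(external)


# Hypothetical three-module codebase.
sources = {
    "billing": "import orders\nfrom requests import get\n",
    "orders": "import sqlalchemy\nimport inventory\n",
    "inventory": "import json\n",
}
graph = build_import_graph(sources)
print(external_dependencies(graph, "billing"))
```

Starting the walk at the module implicated in an incident surfaces every external dependency it reaches transitively, which is the kind of context that could help when a failure originates in an unfamiliar framework rather than in the organization's own code.
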
| Publication year | 2025 |
|---|---|
| Citations | 0 |
| Country of publication | Gabon, Anguilla |
| Site | Springer |