Contributor
Colton G

CAPA: Ghidra Integration


Mentors
Willi Ballenthin, Mike Hunhoff, Blas Kojusner, Elliot Chernofsky, Conor Quigley
Organization
FLARE
Technologies
Java, x86, Python 3, Ghidra, IDA Pro, Vivisect, dnfile
Topics
security, reverse engineering, emulation, disassembly, binary analysis
CAPA is the FLARE team’s open-source tool for identifying program capabilities using an extensible rule set. Each rule is matched against features that CAPA extracts from a program. These include file-level features, such as strings, section names, imports, and exports, and function-level features, such as API calls, string and byte references, instruction mnemonics, and numeric constants.

CAPA uses feature extractors, called "backends", to extract features from supported file types (PE, ELF, and .NET) and architectures (32- and 64-bit x86). Each backend is built around an existing tool or library whose file parsing and disassembly capabilities CAPA relies on to extract features. CAPA currently implements backends using Vivisect, IDA Pro, and dnfile.

Ghidra is a popular open-source disassembly framework with a robust API for accessing its analysis results, which include parsed file formats and disassembled code. The goal of this project is to develop a Ghidra backend for CAPA using Python 3 (via Ghidrathon) and Ghidra’s scripting API. Users should be able to invoke CAPA so that it uses Ghidra’s analysis engine, invoke CAPA from within Ghidra, or both.
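
To make the backend idea concrete, the following is a minimal sketch of how a feature extractor and rule matching could fit together. Every name in it (`Feature`, `Backend`, `ToyBackend`, `match_function_rules`, `RULES`) is hypothetical and chosen for illustration; none of it is CAPA's actual API or rule format. The toy backend returns hard-coded features where a real backend would query Vivisect, IDA Pro, dnfile, or Ghidra, and a rule is reduced to a set of features that must all appear in one function.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterator, List, Set, Tuple


@dataclass(frozen=True)
class Feature:
    """One extracted feature (hypothetical type, not CAPA's own)."""
    kind: str    # e.g. "api", "string", "mnemonic", "number"
    value: object


class Backend(ABC):
    """Hypothetical backend interface: wraps an analysis tool and yields features."""

    @abstractmethod
    def file_features(self) -> Iterator[Feature]:
        """Yield file-level features such as imports and section names."""

    @abstractmethod
    def function_features(self) -> Iterator[Tuple[int, Feature]]:
        """Yield (function address, feature) pairs."""


class ToyBackend(Backend):
    """Stand-in backend with hard-coded data instead of a real disassembler."""

    def file_features(self) -> Iterator[Feature]:
        yield Feature("import", "CreateFileA")
        yield Feature("section", ".text")

    def function_features(self) -> Iterator[Tuple[int, Feature]]:
        yield 0x401000, Feature("api", "CreateFileA")
        yield 0x401000, Feature("api", "WriteFile")
        yield 0x402000, Feature("mnemonic", "xor")


def match_function_rules(
    backend: Backend, rules: Dict[str, FrozenSet[Feature]]
) -> Dict[str, List[int]]:
    """Return, per rule name, the functions that contain every required feature."""
    # Group the extracted features by the function they were found in.
    by_function: Dict[int, Set[Feature]] = {}
    for address, feature in backend.function_features():
        by_function.setdefault(address, set()).add(feature)

    matches: Dict[str, List[int]] = {}
    for name, required in rules.items():
        hits = sorted(a for a, feats in by_function.items() if required <= feats)
        if hits:
            matches[name] = hits
    return matches


# Simplified rules: each one is just a set of features that must all be present.
RULES = {
    "write file": frozenset(
        {Feature("api", "CreateFileA"), Feature("api", "WriteFile")}
    ),
    "create process": frozenset({Feature("api", "CreateProcessA")}),
}

if __name__ == "__main__":
    for rule, addresses in match_function_rules(ToyBackend(), RULES).items():
        print(rule, [hex(a) for a in addresses])
```

In this sketch the matching code depends only on the `Backend` interface, so supporting a new analysis tool means writing one more subclass and leaving the matcher untouched. That separation is the property a Ghidra backend would rely on, assuming the real interface follows a similar shape.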