Core API¶
New in version 1.0.
This section documents the FIDL core API. It is intended for developers of IDA plugins.
API Overview¶
-
class
decompiler_utils.
BBGraph
(f_ea)¶ Representation of the assembly CFG for a function
-
find_connected_paths
(bb_start, bb_end, co=10)¶ Leverages NetworkX to find all connected paths
- Parameters
bb_start (Basic block) – Initial basic block
bb_end (Basic block) – Final basic block
co (int, optional) – Cutoff parameter
NOTE: the cutoff parameter in
nx.all_simple_paths
serves two purposes:
reduce the chances of CPU melting (the algorithm is O(n!))
avoid monstrous paths that nobody will inspect manually
- Returns
generator of lists or None
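To illustrate what the cutoff buys, the following is a minimal pure-Python sketch of enumerating simple paths with a depth limit. It is a stand-in for nx.all_simple_paths, not FIDL's own code; the helper name simple_paths and the toy graph are assumptions made for illustration only.

```python
def simple_paths(graph, start, end, cutoff):
    """Yield every simple path from start to end using at most `cutoff` edges.

    `graph` maps a node to a list of successors (a hypothetical stand-in
    for a basic-block graph).
    """
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        if len(path) - 1 >= cutoff:
            continue  # this path already uses `cutoff` edges: stop extending it
        for succ in graph.get(node, []):
            if succ not in path:  # simple path: never revisit a node
                stack.append((succ, path + [succ]))


# Toy CFG: 1 -> 2 -> 4 and 1 -> 3 -> 5 -> 4
toy_cfg = {1: [2, 3], 2: [4], 3: [5], 5: [4]}
all_paths = sorted(simple_paths(toy_cfg, 1, 4, cutoff=10))
short_paths = sorted(simple_paths(toy_cfg, 1, 4, cutoff=2))
```

With a generous cutoff both paths are found; with cutoff=2 only the two-edge path survives, which is how the parameter bounds both the running time and the size of the results.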
-
get_node
(addr)¶ Given an address within a function, returns the address of the basic block that contains it (or None)
- Parameters
addr (int) – address within a function
- Returns
Address of the node containing the input address
- Return type
int
-
-
decompiler_utils.
NonLibFunctions
(start_ea=None, min_size=0)¶ Generator yielding only non-lib functions
- Parameters
start_ea (int, optional) – Address to start looking for non-library functions.
min_size (int, optional) – Minimum function size. Useful to filter small, uninteresting functions.
-
decompiler_utils.
all_paths_between
(c, start_node=None, end_node=None, co=40)¶ Calculates all paths between start_node and end_node
Calculating paths is one of those things that is better done with the parallel index graph (c.i_cfg). It goes haywire when done with complex elements.
FIXME: the co (cutoff) param is necessary to avoid complexity explosion. However, there is a problem if it’s reached…
- Parameters
c (controlFlowinator) – a controlFlowinator object
start_node (cexpr_t) – a controlFlowinator node
end_node (cexpr_t) – a controlFlowinator node
co (int, optional) – the cutoff value controls the maximum path length.
- Returns
it yields a list of nodes for each path
- Return type
list
-
decompiler_utils.
assigns_to_var
(cex)¶ Does this cexpr_t assign a value to any variable?
TODO: this is limited for now to expressions of the type:
v1 = something something
- Parameters
cex (cexpr_t) – a cexpr_t object
- Returns
the assigned var index (to the cf.lvars array) or -1 if the cexpr_t does not assign to any variable
- Return type
int
-
decompiler_utils.
blowup_expression
(cex, final_operands=None)¶ Extracts all elements of an expression
Ex: x + 1 < y -> {x, 1, y}
- Parameters
cex (cexpr_t) – a cexpr_t object
- Returns
a set of elements (the final_operands)
- Return type
set
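The idea of "blowing up" an expression can be sketched without IDA. The snippet below models an expression as nested tuples and collects its leaves into a set. The tuple encoding and the name blowup are assumptions for illustration; the real function operates on Hex-Rays cexpr_t objects.

```python
def blowup(expr, final_operands=None):
    """Collect the leaves of a nested expression into a set.

    An expression is modelled as a tuple (operator, left, right); anything
    else is treated as a final operand. This toy model is an assumption and
    only mimics the idea, it does not touch cexpr_t objects.
    """
    if final_operands is None:
        final_operands = set()
    if isinstance(expr, tuple):
        _, left, right = expr
        blowup(left, final_operands)
        blowup(right, final_operands)
    else:
        final_operands.add(expr)
    return final_operands


# x + 1 < y
expr = ("<", ("+", "x", 1), "y")
operands = blowup(expr)
```

Passing the accumulator set down the recursion mirrors the final_operands parameter in the signature above.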
-
class
decompiler_utils.
cImporter
¶ Collect import information
This is mainly to work around the fact that get_func_name does not resolve imports…
-
get_imports_info
()¶
-
-
class
decompiler_utils.
callObj
(c=None, name='', node=None, expr=None)¶ Auxiliary object for code clarity.
It represents the occurrence of a call expression.
- Parameters
name (string, optional) – name of the function called
node (controlFlowinator) – a controlFlowinator node containing the call expression
expr (cexpr_t) – the call expression element
-
decompiler_utils.
citem2higher
(citem)¶ This gets the higher representation of a given citem, that is, a cinsn_t or cexpr_t
- Parameters
citem (citem) – a citem object
-
class
decompiler_utils.
controlFlowinator
(ea=None, fast=True)¶ This is the main object of FIDL’s API.
It finds all decompiled code “blocks” and recreates a CFG based on this information.
This gives us the best of both worlds: the possibility to analyze a graph (like in disassembly mode) and the power of citem-based analysis.
Some analysis is performed after the CFG has been constructed. These passes are rather costly, so they are turned off by default. Use fast=False to apply them and get a better CFG.
- Parameters
ea (int) – address of the function to analyze
fast (bool) – Set to False for an object with richer information
-
dump_cfg
(out_dir)¶ Dump the CFG for debugging purposes
This dumps a representation of the CFG in DOT format. To generate an image:
dot.exe -Tpng decompiled.dot -o decompiled.png
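For readers unfamiliar with DOT, the sketch below shows what a textual graph description of this kind can look like. The helper to_dot, the node labels and the exact layout are assumptions for illustration; the file that dump_cfg actually writes may differ.

```python
def to_dot(edges, name="decompiled"):
    """Render a list of (source, destination) label pairs as DOT text.

    Labels are assumed not to contain double quotes (toy example).
    """
    lines = ["digraph %s {" % name]
    for src, dst in edges:
        lines.append('    "%s" -> "%s";' % (src, dst))
    lines.append("}")
    return "\n".join(lines)


# Hypothetical CFG with one branch
dot_text = to_dot([("if (a1)", "return 0"), ("if (a1)", "call foo")])
```

Writing dot_text to decompiled.dot would produce a file that the dot command above can turn into an image.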
-
dump_i_cfg
()¶ Dump interim CFG for debugging purposes
-
decompiler_utils.
create_comment
(c=None, ea=0, comment='')¶ Displays a comment at the line corresponding to ea
TODO: avoid creating an orphan comment in case the mapping from ea to decompiled code fails
- Parameters
c (controlFlowinator) – a controlFlowinator object
ea (int) – address for the comment
comment (string) – the comment to add
-
decompiler_utils.
debug_blownup_expressions
(c=None)¶ Debugging helper.
Show all blown up expressions for this function.
- Parameters
c (controlFlowinator) – a controlFlowinator object
-
decompiler_utils.
debug_get_break_statements
(c)¶
-
decompiler_utils.
debug_stahp
()¶ Toggles
DEBUG
value, useful for testing
-
decompiler_utils.
decast
(ins)¶ Remove the
cast
, returning the casted element
-
decompiler_utils.
display_all_calls_to
(func_name)¶ Wraps display_line_at(), since this is the most common use of this API
- Parameters
func_name (string) – name of the function whose references are searched
-
decompiler_utils.
display_line_at
(ea, silent=False)¶ Displays the line of pseudocode corresponding to
ea
This is useful to quickly answer questions like:
“Is this function always called with its first parameter being a constant?”
“I want to see all the error messages displayed by this function”
etc.
- Parameters
ea (int) – address of an element contained within the line to display
silent (bool) – flag controlling verbose output
-
decompiler_utils.
display_node
(c=None, node=None, color=None)¶ Displays a given node in the
pseudoviewer
- Parameters
c (controlFlowinator) – a controlFlowinator object
node (cexpr_t) – a controlFlowinator node
color (int, optional) – color to mark the line of code corresponding to node
-
decompiler_utils.
display_path
(cf=None, path=None, color=None)¶ Shows a path’s code and colors its lines.
- Parameters
cf (a cfunc_t object, optional) – a decompilation object
path (list) – a list of controlFlowinator nodes
color (int, optional) – color to mark the lines of code corresponding to path
- Returns
a list of function lines (path nodes)
- Return type
list
-
decompiler_utils.
do_for_all_funcs
(func, fast=True, start_ea=None, blacklist=None, min_size=100, **kwargs)¶ This is a generic wrapper for all kinds of logic that we want to apply to all the functions in the binary.
- Parameters
func (function) – function “pointer” performing the analysis. Its only mandatory argument is a controlFlowinator object.
fast (boolean, optional) – parameter fast for the controlFlowinator object.
start_ea (int, optional) – Address to start looking for non-library functions.
blacklist (function, optional) – a function determining whether to process a function. Implemented via dependency injection.
- Returns
A list of JSON-like messages (individual function results)
- Return type
list
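The dependency-injection pattern described above can be sketched in plain Python. Everything here is hypothetical: for_all, skip_thunks and report are illustrative names, function names stand in for controlFlowinator objects, and the sketch assumes that the injected predicate returns True for functions to skip, which the documentation does not state.

```python
def for_all(items, func, blacklist=None, **kwargs):
    """Apply `func` to every item not rejected by the injected predicate.

    Assumption: `blacklist(item)` returning True means "skip this item".
    """
    results = []
    for item in items:
        if blacklist is not None and blacklist(item):
            continue
        res = func(item, **kwargs)
        if res:
            results.append(res)  # keep only non-empty results
    return results


def skip_thunks(name):
    # Hypothetical filter: ignore jump thunks
    return name.startswith("j_")


def report(name, tag="seen"):
    # Hypothetical analysis returning a JSON-like message
    return {"function": name, "tag": tag}


messages = for_all(["main", "j_memcpy", "parse"], report,
                   blacklist=skip_thunks, tag="todo")
```

Extra keyword arguments are forwarded untouched to the analysis callback, in the same spirit as the **kwargs in the signature above.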
-
decompiler_utils.
does_constrain
(node)¶ This tries to answer the question: “Does this node constrain variables in any way?”
Essentially, it looks for the occurrence of variables within known constraining constructs, e.g. inside an if condition.
TODO: many more heuristics can be included here
- Parameters
node (cinsn_t or cexpr_t) – typically a controlFlowinator node
- Returns
a set of variable indexes (to the cf.lvars array)
- Return type
set
-
decompiler_utils.
dprint
(s='')¶ This will print a debug message only if debugging is active
- Parameters
s (str, optional) – The debug message
-
decompiler_utils.
dump_lvars
(ea=0)¶ Debugging helper.
-
decompiler_utils.
dump_pseudocode
(ea=0)¶ Debugging helper.
-
decompiler_utils.
find_all_calls_to
(f_name)¶ Finds all calls to a function with the given name
Note that the string comparison is relaxed to find variants of the name, that is, searching for malloc will also match _malloc, malloc_0, etc.
- Parameters
f_name (string) – the function name to search for
- Returns
a list of
callObj
- Return type
list
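The exact matching rule is not spelled out here, so the following is only one possible interpretation of a "relaxed" comparison, written as a standalone sketch. The helper relaxed_match and its regular expression are assumptions; they accept the variants listed above (leading underscores or dots, a trailing numeric suffix) and may be stricter or looser than what FIDL really does.

```python
import re


def relaxed_match(wanted, candidate):
    """True if `candidate` is `wanted` with optional leading underscores or
    dots and an optional numeric suffix such as _0 (assumed rule)."""
    pattern = r"^[_.]*" + re.escape(wanted) + r"(_\d+)?$"
    return re.match(pattern, candidate) is not None


names = ["malloc", "_malloc", "malloc_0", "my_malloc_wrapper", "calloc"]
matches = [n for n in names if relaxed_match("malloc", n)]
```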
-
decompiler_utils.
find_all_calls_to_within
(f_name, ea)¶ Finds all calls to a function with the given name within the function containing the ea address.
Note that the string comparison is relaxed to find variants of the name, that is, searching for malloc will also match _malloc, malloc_0, etc.
- Parameters
f_name (string) – the function name to search for
ea (int) – any address within the function that may contain the calls
- Returns
a list of
callObj
- Return type
list
-
decompiler_utils.
find_elements_of_type
(cex, element_type, elements=None)¶ Recursively extracts expression elements until a cexpr_t from a specific group is found
- Parameters
cex (cexpr_t) – a cexpr_t object
element_type (a cot_xxx value, e.g. cot_add) – the type of element we are looking for (as a cot_xxx value, see compiler_consts.py)
- Returns
a set of cexpr_t of the specified type
- Return type
set
-
decompiler_utils.
get_all_vars_in_node
(cex)¶ Extracts all variables involved in an expression.
- Parameters
cex (cexpr_t) – typically a controlFlowinator node
- Returns
list of var_t indexes (to cf.lvars)
- Return type
list
-
decompiler_utils.
get_cfg_for_ea
(ea, dot_exe, out_dir)¶ Debugging helper.
Uses DOT to create a .PNG graphic of the controlFlowinator CFG and displays it.
- Parameters
ea (int) – address of the function to analyze
dot_exe (string) – path to the DOT binary
out_dir (string) – directory to write the .DOT file
-
decompiler_utils.
get_cond_from_statement
(ins)¶ Given a cinsn_t representing a control flow structure (do, while, for, etc.), it returns the corresponding cexpr_t representing the condition/argument for that code construct.
This is useful since we usually want to peek into conditional statements…
- Parameters
ins (cinsn_t) – the cinsn_t associated with a control flow structure
- Returns
the condition or argument within that control flow structure
- Return type
cexpr_t
-
decompiler_utils.
get_function_vars
(c=None, ea=0, only_args=False, only_locals=False)¶ Populates a dict of my_var_t for the function containing the specified ea
- Parameters
c (controlFlowinator, optional) – a controlFlowinator object
ea (int) – the function address
only_args (bool, optional) – extract only function arguments
only_locals (bool, optional) – extract only local variables
- Returns
A dictionary of my_var_t, indexed by their index
-
decompiler_utils.
get_interesting_calls
(c, user_defined=[])¶ Not all functions are created equal. We are interested in functions with certain names or substrings in them.
- Parameters
c (controlFlowinator) – a controlFlowinator object
user_defined (list, optional) – a list of names (or substrings). If not supplied, a hard-coded default list will be used.
- Returns
a list of
callObj
- Return type
list
-
decompiler_utils.
get_return_type
(cf=None)¶ Hack to get the return type of a function.
- Parameters
cf (ida_hexrays.cfuncptr_t) – the result of decompile()
- Returns
Type information for the return value
- Return type
tinfo_t
-
decompiler_utils.
is_arithmetic_expression
(cex, only_these=[])¶ Checks whether this is an arithmetic expression.
- Parameters
cex (cexpr_t) – expression, usually this is a node.
only_these (a list of cot_* constants, e.g. cot_add) – a list of arithmetic expressions to look for. These are defined in ida_hexrays
- Returns
True or False
- Return type
bool
-
decompiler_utils.
is_array_indexing
(ins)¶
-
decompiler_utils.
is_asg
(ins)¶
-
decompiler_utils.
is_binary_truncation
(cex)¶ Looking for expressions truncating a number
These expressions are of the form v1 & 0xFFFF or similar
- Parameters
cex (cexpr_t) – an expression
- Returns
True or False
- Return type
bool
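What makes a constant such as 0xFFFF a "truncating" mask can be checked with a small bit trick. The sketch below is an assumption about one reasonable criterion (a contiguous run of ones starting at bit 0); the helper names are hypothetical and the heuristic FIDL actually applies to cexpr_t objects may differ.

```python
def is_low_bit_mask(value):
    """True for 0x1, 0x3, ..., 0xFF, 0xFFFF, ...: a run of ones from bit 0."""
    return value > 0 and (value & (value + 1)) == 0


def truncated_width(value):
    """Number of low bits kept by `x & value`, or None if `value` is not
    such a mask."""
    return value.bit_length() if is_low_bit_mask(value) else None


word_width = truncated_width(0xFFFF)
byte_width = truncated_width(0xFF)
not_a_mask = truncated_width(0xFF00)
```

Adding one to a run of low ones clears all of them, so the bitwise AND is zero exactly for masks of that shape.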
-
decompiler_utils.
is_call
(ins)¶
-
decompiler_utils.
is_cast
(ins)¶
-
decompiler_utils.
is_final_expr
(cex)¶ Helper for internal functions.
A final expression will be defined as one that can not be further decomposed, eg. number, var, string, etc.
Normally, you should not need to use this.
- Parameters
cex (cexpr_t) – a cexpr_t object
- Returns
True or False
- Return type
bool
-
decompiler_utils.
is_global_var
(ins)¶ Tells whether ins is a global variable
TODO: enhance this heuristic
- Parameters
ins – cexpr_t or insn_t
- Returns
True or False
- Return type
bool
-
decompiler_utils.
is_if
(ins)¶
-
decompiler_utils.
is_number
(ins)¶ Convenience wrapper
-
decompiler_utils.
is_ptr
(ins)¶
-
decompiler_utils.
is_read
(ins)¶ Try to find read primitives.
Looking for things like:
v3 = *(_DWORD *)(v5 + 784)
NOTE: this will find expressions that are read && write, since they are not mutually exclusive
TODO: Rather rough, it is a first version…
- Parameters
node (cinsn_t or cexpr_t) – a controlFlowinator node
- Returns
True or False
- Return type
bool
-
decompiler_utils.
is_ref
(ins)¶
-
decompiler_utils.
is_string
(ins)¶ Convenience wrapper
-
decompiler_utils.
is_var
(ins)¶ Whether this ins corresponds to a variable
Remember that if this evaluates to True, we are dealing with an object of type var_ref_t, which is pretty much useless on its own. We may want to convert it to a lvar_t and, even better, to a my_var_t afterwards.
ref2var() is a simple wrapper to perform the conversion between reference and variable
-
decompiler_utils.
is_write
(node)¶ Try to find write primitives.
Looking for things like:
*(_DWORD *)(something) = v38
arr[i] = v21
TODO: Rather rough, it is a first version…
- Parameters
node (cinsn_t or cexpr_t) – a controlFlowinator node
- Returns
True or False
- Return type
bool
-
decompiler_utils.
lex_citem_indexes
(line)¶ Part of Lighthouse plugin
Lex all ctree item indexes from a given line of text. The HexRays decompiler output contains invisible text tokens that can be used to attribute spans of text to the ctree items that produced them.
-
decompiler_utils.
lines_and_code
(cf=None, ea=0)¶ Mapping of line numbers and code
- Parameters
cf (a cfunc_t object, optional) – a decompilation object
ea (int, optional) – Address within the function to decompile, if no cf is provided
- Returns
a dictionary of lines of code, indexed by line number
- Return type
dict
-
decompiler_utils.
main
()¶
-
decompiler_utils.
map_citem2line
(line2citem)¶ Part of Lighthouse plugin
Creates a mapping of citem indexes to lines of code
-
decompiler_utils.
map_line2citem
(decompilation_text)¶ Part of Lighthouse plugin
Map decompilation line numbers to citems. This function allows us to build a relationship between citems in the ctree and specific lines in the hexrays decompilation text.
-
decompiler_utils.
map_line2node
(cfunc, line2citem)¶ Part of Lighthouse plugin
Map decompilation line numbers to node (basic blocks) addresses. This function allows us to build a relationship between graph nodes (basic blocks) and specific lines in the hexrays decompilation text.
-
decompiler_utils.
map_node2lines
(line2node)¶ Part of Lighthouse plugin
Creates a mapping of nodes to lines of code
-
decompiler_utils.
my_decompile
(ea=None)¶ Decompiles the function, setting the flags necessary to use the decompiler programmatically.
- Parameters
ea (int) – Address within the function to decompile
- Returns
decompilation object
- Return type
a
cfunc_t
-
decompiler_utils.
my_get_func_name
(ea)¶ Wrapper for get_func_name handling some corner cases.
- Parameters
ea (int) – Address of the function whose name is to be resolved
-
class
decompiler_utils.
my_var_t
(var)¶ This wraps the lvar_t nicely into a more usable data structure.
It aggregates several interesting pieces of information in one place, e.g. is_arg, is_constrained, is_initialized, etc.
The most commonly used attributes for this class are:
name
type_name
size
is_arg
is_pointer
is_array
is_signed
- Parameters
var (lvar_t) – an object representing a local variable or function argument
-
decompiler_utils.
num_value
(ins)¶ Returns the numerical value of
ins
- Parameters
ins – cexpr_t or insn_t
-
decompiler_utils.
points_to
(ins)¶
-
class
decompiler_utils.
pseudoViewer
¶ This wraps the pseudoViewer API neatly.
We need it because some things don’t work unless you previously visited (or are currently visiting) the function whose decompiled form you want to analyze. Thus, we are forced to “Hack like in the movies”
TODO: probably deprecate this after IDA 7.5 changes
NOTE: the performance penalty is negligible
-
close
()¶ Closes the pseudoviewer widget
-
show
(ea=0, flags=8)¶ Displays the pseudoviewer widget
- Parameters
ea (int, optional) – address of the function to display
flags (int, optional) – flags controlling how to treat an existing pseudocode display, if any
-
silent_flags
= 8¶
-
-
decompiler_utils.
ref2var
(ref, c=None, cf=None)¶ Convenient wrapper to streamline the conversions between var_ref_t and lvar_t
- Parameters
c (controlFlowinator, optional) – a controlFlowinator object
cf (a cfunc_t object, optional) – a decompilation object (usually the result of decompile)
ref (var_ref_t) – a reference to a variable in the pseudocode
- Returns
a lvar_t object
- Return type
lvar_t
-
decompiler_utils.
ref_to
(ins)¶
-
decompiler_utils.
string_value
(ins)¶ Gets the string corresponding to
ins
Works with C-str and Unicode
- Parameters
ins – cexpr_t or insn_t
- Returns
string for this
ins
- Return type
string
-
decompiler_utils.
value_of_global
(ins)¶ Returns the value of a global variable