Skip to content
Draft
Show file tree
Hide file tree
Changes from 1 commit
Commits
Show all changes
109 commits
Select commit Hold shift + click to select a range
43fddb3
Skeleton code for multi-block lifting.
ltfish Jun 10, 2025
f5f503d
Update version to 9.2.161.dev0 [ci skip]
github-actions[bot] Jun 10, 2025
962c59c
VRA: Use function graph for dominance frontier if available. (#5514)
ltfish Jun 11, 2025
9de92a3
Phoenix: Remove a memo-based hack in short-circuit matching. (#5521)
ltfish Jun 12, 2025
c968277
Fix store and extract for SimTypeWideChar. (#5522)
ltfish Jun 12, 2025
f556f23
BlockSimplifier: Fix incorrect removal of call statements. (#5523)
ltfish Jun 12, 2025
ceb11bb
Project: Query kernel32.dll for APIs in api-ms-win-xxx.dll. (#5524)
ltfish Jun 12, 2025
0d3f6c8
Improve handling of phi var-involving constant propagation. (#5525)
ltfish Jun 12, 2025
0cb9a06
VRA: Limit stack variable stores to 256 bytes. (#5526)
ltfish Jun 12, 2025
36ad2f8
Peephole: Signed division simplification case B. (#5530)
ltfish Jun 13, 2025
26244a7
Fix the check on guard that incorrectly checks the False condition (#…
julienlancia Jun 13, 2025
9047b92
Add Emulator and ConcreteEngine; Adapt IcicleEngine to ConcreteEngine…
twizmwazin Jun 13, 2025
bb9fd94
Update version to 9.2.162.dev0 [ci skip]
github-actions[bot] Jun 17, 2025
fcd04fc
SPropagator: Propagate stackvars that are not eliminatable. (#5532)
ltfish Jun 18, 2025
854f0f5
CFGBase: Recognize int 3 as function alignments. (#5536)
ltfish Jun 18, 2025
e327f18
utils: Drop unused arguments to get_cpp_function_name
mborgerson Jun 20, 2025
4c887b3
utils: Type annotate get_cpp_function_name
mborgerson Jun 20, 2025
df2b3ff
utils: Use parse_cpp_file in get_cpp_function_name for better parsing
mborgerson Jun 20, 2025
278a597
utils: Add a docstring to get_cpp_function_name
mborgerson Jun 20, 2025
d332e11
utils: Add a test for get_cpp_function_name
mborgerson Jun 20, 2025
3d44d3b
SAILR: Fix indentation (and the heuristics). (#5543)
ltfish Jun 20, 2025
bd741e9
GraphRegion: Use new sub graph regions when replacing subgraphs. (#5544)
ltfish Jun 20, 2025
30761d8
Fix the implementation of to_acyclic_graph. (#5539)
ltfish Jun 20, 2025
815614f
SAILR: Limit graph and edge scale for heuristic 2. (#5547)
ltfish Jun 20, 2025
253d6d0
LoweredSwitchSimp: Do not test graph structurability. (#5540)
ltfish Jun 20, 2025
00a1044
Phoenix: Speed up negating condition checks. (#5546)
ltfish Jun 20, 2025
d5fd0b5
Phoenix: Speed up acyclic graph conversion. (#5548)
ltfish Jun 21, 2025
13f037a
ReturnDuplicatorLow: Set a lower max_func_blocks limit. (#5549)
ltfish Jun 21, 2025
0ed5486
ci: bump actions-rust-lang/setup-rust-toolchain from 1.12.0 to 1.13.0…
dependabot[bot] Jun 23, 2025
79b3345
ci: bump astral-sh/setup-uv from 6.1.0 to 6.3.0 (#5552)
dependabot[bot] Jun 23, 2025
d70d76c
Merging sub-function definitions into parent all_definitions (#5550)
thelastede Jun 23, 2025
b5aaa47
[pre-commit.ci] pre-commit autoupdate (#5551)
pre-commit-ci[bot] Jun 23, 2025
48600f4
Release GIL while icicle VM is running (#5555)
twizmwazin Jun 24, 2025
448a82c
Decompilation: Fix new bogus variables during subsequent decompilatio…
ltfish Jun 24, 2025
4172bc7
Update version to 9.2.163.dev0 [ci skip]
github-actions[bot] Jun 24, 2025
afce61a
Replace nasm with keystone through load_shellcode in smc test (#5556)
IParsons1000 Jun 25, 2025
d6ff116
Fix test_smc.py (#5559)
rhelmot Jun 26, 2025
e8259ac
Add rust-toolchain.toml (#5560)
twizmwazin Jun 26, 2025
4bf88cf
FCP: Check if node is in graph before getting its successors. (#5557)
ltfish Jun 27, 2025
fe58bc4
Update to Rust 1.88 (#5561)
twizmwazin Jun 27, 2025
bcdd147
test_decompiling_abnormal_switch_case_case3: Print the actual base ty…
ltfish Jun 27, 2025
d56c5fa
FLIRT: Catch decompression errors; end when function name bytes >= 0x…
ltfish Jun 27, 2025
41f7fb5
Decompilation: Fix type renaming, struct field renaming and retyping.…
ltfish Jun 27, 2025
daa5148
[pre-commit.ci] pre-commit autoupdate (#5565)
pre-commit-ci[bot] Jun 30, 2025
54ef8bc
ci: bump astral-sh/setup-uv from 6.3.0 to 6.3.1 (#5566)
dependabot[bot] Jun 30, 2025
5bd600e
CFGBase: Fix an O(N^2) list scanning in bad function detection. (#5569)
ltfish Jul 1, 2025
f5d61e2
fix: Some inconsitencies with RDA function handlers (#5568)
Cl4sm Jul 1, 2025
52c4533
Update version to 9.2.164.dev0 [ci skip]
github-actions[bot] Jul 1, 2025
d52ef48
CFGBase: Lift fewer blocks when finding bad function starts. (#5570)
ltfish Jul 1, 2025
adc65ae
Add block tracing capability to Icicle engine (#5572)
twizmwazin Jul 1, 2025
2018317
ConstResolver: More caching and filtering for fewer FCP calls. (#5571)
ltfish Jul 1, 2025
2835b53
Three fixes in region identifier and phoenix for bizarre switch-cases…
ltfish Jul 2, 2025
7d85aa1
Struct.size: Do not raise exceptions if the last field is a BOT. (#5575)
ltfish Jul 2, 2025
79c2ac4
AIL: Fix CAS conversion for oldHi. (#5576)
ltfish Jul 2, 2025
68191c0
to_acyclic_graph: Fix an off-by-one error when removing edges. (#5577)
ltfish Jul 2, 2025
20295af
Try pytest marker (#5578)
twizmwazin Jul 2, 2025
b9bf666
GraphRegion: Fix an issue causing duplicated GraphRegions. (#5579)
ltfish Jul 4, 2025
f6256fc
Fix missing type check in _find_node_in_graph. (#5580)
ltfish Jul 4, 2025
02f352c
Phoenix: Do not make a switch if node_a is region head. (#5581)
ltfish Jul 4, 2025
ffca3ac
is_alignment_mask: Add 0xffffffe0. (#5583)
ltfish Jul 4, 2025
ecd5a3d
SimCppClass: Give opaque cpp classes a default size. (#5582)
ltfish Jul 4, 2025
f23c264
CallsiteMaker: Expand struct arg locs recursively. (#5584)
ltfish Jul 4, 2025
92a2059
Phoenix: Fix natural loop creation logic in subgraphs. (#5585)
ltfish Jul 5, 2025
4c79ec6
CFGEmulated: Catch AngrSyscallError when getting syscall procedures. …
ltfish Jul 6, 2025
5e3a736
AILSimplifier: Fix stmt ID mismatches in calls/assignments_to_remove.…
ltfish Jul 6, 2025
b462e27
CFGFast: Reduce segment list fragmentation for ARMCortexM binaries. (…
ltfish Jul 7, 2025
a3d7e95
[pre-commit.ci] pre-commit autoupdate (#5589)
pre-commit-ci[bot] Jul 7, 2025
f22f129
AILSimplifier: Narrow vector registers. (#5537)
ltfish Jul 8, 2025
3c6db20
CFGFast: Include more analysis details in the progressbar. (#5590)
ltfish Jul 8, 2025
28128ba
Update version to 9.2.165.dev0 [ci skip]
github-actions[bot] Jul 8, 2025
98d1a22
Do not lazy-import sympy. Fix #5591. (#5592)
ltfish Jul 9, 2025
5e8d2ea
Peephole: Add inlined memcpy recognition. (#5535)
ltfish Jul 9, 2025
3d4f466
Add AFL-style edge hitmap support to Icicle engine (#5593)
twizmwazin Jul 9, 2025
849c674
InlinedStrcpy: Support more types of statements. (#5594)
ltfish Jul 9, 2025
6cd7ab7
Enhance Type 3 string obfuscation finder. (#5533)
ltfish Jul 9, 2025
0024053
added multi_lift prototipe
rodriguezzfran Jul 9, 2025
2c60c46
merge master into this
BrunoEcl Jul 10, 2025
4cb8273
VEXLifter refactor for lift_vex_multi
BrunoEcl Jul 25, 2025
7acefa3
return list of blocks
BrunoGugli Jul 28, 2025
9d3a55d
yapf
BrunoEcl Jul 28, 2025
0b7d24f
change
BrunoEcl Jul 28, 2025
728810e
arguments refactor
BrunoEcl Jul 29, 2025
25fb188
Merge branch 'master' into feat/multiblock_lifting
BrunoGugli Aug 23, 2025
449aa0f
Merge branch 'feat/multiblock_lifting' of github.com:rodriguezzfran/a…
BrunoGugli Aug 23, 2025
7404aee
use huge size to load program memory
BrunoGugli Aug 23, 2025
55f2d4b
first experiment
BrunoGugli Aug 24, 2025
987304a
python fixes
BrunoGugli Aug 24, 2025
3b3d3e3
comments
rodriguezzfran Aug 26, 2025
1305a02
Merge branch 'master' into feat/multiblock_lifting
BrunoEcl Aug 26, 2025
9561dd2
comments
BrunoEcl Aug 27, 2025
3d31661
added vex_lift_multi call
Aug 28, 2025
76aa204
deleted debug comments
Sep 1, 2025
a950b2c
more debug
Sep 2, 2025
8e83ee2
Merge branch 'master' into feat/multiblock_lifting
rodriguezzfran Sep 22, 2025
9142bba
added point to cgf nodes
rodriguezzfran Oct 6, 2025
ca40dc8
testing multi block list
BrunoEcl Oct 8, 2025
8a263a3
Merge branch 'feat/multiblock_lifting' of github.com:rodriguezzfran/a…
BrunoEcl Oct 8, 2025
3f0d6d0
Merge branch 'master' into feat/multiblock_lifting
Oct 8, 2025
6aebd4d
creating generate_cfgnodes
BrunoEcl Oct 11, 2025
5a84e55
generate cfgnodeS
BrunoEcl Oct 12, 2025
1a3396b
removed GenerateCFGNodeResult type and added get_irsb method on block.py
BrunoEcl Oct 13, 2025
c0b5bd9
scan irsb multi
BrunoEcl Oct 26, 2025
0bd80fa
merge master into this
BrunoEcl Oct 30, 2025
3861756
lifting a truncated list
BrunoGugli Nov 3, 2025
7e480b9
Fish's HACK: Use lift_multi() as a caching mechanism.
ltfish Dec 4, 2025
b94da41
debugging cache
BrunoGugli Dec 29, 2025
793a193
cache debugged
BrunoGugli Dec 29, 2025
3484854
remove fish's hack
BrunoGugli Dec 29, 2025
a789727
remove block from cache on decoding errors
BrunoGugli Dec 31, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Phoenix: Speed up acyclic graph conversion. (angr#5548)
* Phoenix: Speed up acyclic graph conversion.

Also switches to deterministic post-ordering when possible for improved
speed.

* Add a missing node order update.

* Lint code.

* Properly handle multi-headed acyclic graphs.

* Fix another place where node_order is not updated when new nodes are created.

* Fix one more missed place.

* Add a missed list(reversed(...)).

* Adjust the tr.build_spec_list test case.
  • Loading branch information
ltfish authored and rodriguezzfran committed Jul 9, 2025
commit d5fd0b5475eae99b0e327f38c8d1924c771cd07e
69 changes: 63 additions & 6 deletions angr/analyses/decompiler/structuring/phoenix.py
Original file line number Diff line number Diff line change
Expand Up @@ -126,6 +126,12 @@ def __init__(
self._improve_algorithm = improve_algorithm
self._edge_virtualization_hints = []

# node_order keeps a dictionary of nodes and their order in a quasi-topological sort of the region full graph
# (graph_with_successors). _generate_node_order() initializes this dictionary. we then update this dictionary
# when new nodes are created. we do not populate this dictionary when working on acyclic graphs because it's
# not used for acyclic graphs.
self._node_order: dict[Any, int] | None = None

self._use_multistmtexprs = use_multistmtexprs
self._multistmtexpr_stmt_threshold = multistmtexpr_stmt_threshold
self._analyze()
Expand Down Expand Up @@ -221,8 +227,11 @@ def _analyze(self):

def _analyze_cyclic(self) -> bool:
any_matches = False
acyclic_graph = to_acyclic_graph(self._region.graph, loop_heads=[self._region.head])
for node in list(reversed(GraphUtils.quasi_topological_sort_nodes(acyclic_graph))):

if self._node_order is None:
self._generate_node_order()
acyclic_graph = to_acyclic_graph(self._region.graph, node_order=self._node_order)
for node in list(GraphUtils.dfs_postorder_nodes_deterministic(acyclic_graph, self._region.head)):
if node not in self._region.graph:
continue
matched = self._match_cyclic_schemas(
Expand Down Expand Up @@ -499,6 +508,10 @@ def _match_cyclic_while_with_single_successor(
self.replace_nodes(full_graph, node, loop_node, self_loop=False)
full_graph.add_edge(loop_node, successor_node)

if self._node_order is not None:
self._node_order[loop_node] = self._node_order[node]
self._node_order[successor_node] = self._node_order[loop_node]

return True, loop_node, successor_node

def _cyclic_while_with_single_successor_must_return(self, successor_node: SequenceNode) -> bool:
Expand Down Expand Up @@ -1035,7 +1048,7 @@ def _match_acyclic_schemas(self, graph: networkx.DiGraph, full_graph: networkx.D

self._assert_graph_ok(acyclic_graph, "Removed wrong edges")

for node in list(reversed(GraphUtils.quasi_topological_sort_nodes(acyclic_graph))):
for node in list(GraphUtils.dfs_postorder_nodes_deterministic(acyclic_graph, head)):
if node not in graph:
continue
if graph.has_edge(node, head):
Expand Down Expand Up @@ -1141,6 +1154,8 @@ def _match_acyclic_switch_cases_incomplete_switch_head(self, node, graph, full_g
node_default = Block(SWITCH_MISSING_DEFAULT_NODE_ADDR, 0, statements=[jmp_to_default_node])
graph.add_edge(node, node_default)
full_graph.add_edge(node, node_default)
if self._node_order is not None:
self._node_order[node_default] = self._node_order[node]
r = self._make_switch_cases_core(
node,
self.cond_proc.claripy_ast_from_ail_condition(last_stmt.switch_variable),
Expand Down Expand Up @@ -1246,6 +1261,8 @@ def _match_acyclic_switch_cases_address_loaded_from_memory(self, node, graph, fu
# un-structure IncompleteSwitchCaseNode
if isinstance(node_a, SequenceNode) and node_a.nodes and isinstance(node_a.nodes[0], IncompleteSwitchCaseNode):
_, new_seq_node = self._unpack_sequencenode_head(graph, node_a)
if new_seq_node is not None and self._node_order is not None:
self._node_order[new_seq_node] = self._node_order[node_a]
self._unpack_sequencenode_head(full_graph, node_a, new_seq=new_seq_node)
# update node_a
node_a = next(iter(nn for nn in graph.nodes if nn.addr == target))
Expand All @@ -1256,6 +1273,8 @@ def _match_acyclic_switch_cases_address_loaded_from_memory(self, node, graph, fu
self._unpack_incompleteswitchcasenode(full_graph, node_a) # this shall not fail
# update node_a
node_a = next(iter(nn for nn in graph.nodes if nn.addr == target))
if self._node_order is not None:
self._generate_node_order()

better_node_a = node_a
if isinstance(node_a, SequenceNode) and is_empty_or_label_only_node(node_a.nodes[0]) and len(node_a.nodes) == 2:
Expand Down Expand Up @@ -1600,6 +1619,8 @@ def _match_acyclic_incomplete_switch_cases(
self.replace_nodes(full_graph, node, new_node)
if out_nodes:
full_graph.add_edge(new_node, out_nodes[0])
if self._node_order:
self._node_order[new_node] = self._node_order[node]
return True
return False

Expand Down Expand Up @@ -1781,6 +1802,8 @@ def _make_switch_cases_core(

graph.add_edge(head, scnode)
full_graph.add_edge(head, scnode)
if self._node_order is not None:
self._node_order[scnode] = self._node_order[head]

if out_edges:
# for all out edges going to head, we ensure there is a goto at the end of each corresponding case node
Expand Down Expand Up @@ -2511,7 +2534,7 @@ def _last_resort_refinement(self, head, graph: networkx.DiGraph, full_graph: net
if networkx.is_directed_acyclic_graph(full_graph):
acyclic_graph = full_graph
else:
acyclic_graph = to_acyclic_graph(full_graph, loop_heads=[head])
acyclic_graph = to_acyclic_graph(full_graph, node_order=self._node_order)
for src, dst in acyclic_graph.edges:
if src is dst:
continue
Expand All @@ -2527,7 +2550,20 @@ def _last_resort_refinement(self, head, graph: networkx.DiGraph, full_graph: net
if (src.addr, dst.addr) not in self.whitelist_edges:
other_edges.append((src, dst))

ordered_nodes = GraphUtils.quasi_topological_sort_nodes(acyclic_graph, loop_heads=[head])
# acyclic graph may contain more than one entry node, so we may add a temporary head node to ensure all nodes
# are accounted for in node_seq
graph_entries = [nn for nn in acyclic_graph if acyclic_graph.in_degree[nn] == 0]
postorder_head = head
if len(graph_entries) > 1:
postorder_head = Block(0, 0)
for nn in graph_entries:
acyclic_graph.add_edge(postorder_head, nn)
ordered_nodes = list(
reversed(list(GraphUtils.dfs_postorder_nodes_deterministic(acyclic_graph, postorder_head)))
)
if len(graph_entries) > 1:
ordered_nodes.remove(postorder_head)
acyclic_graph.remove_node(postorder_head)
node_seq = {nn: (len(ordered_nodes) - idx) for (idx, nn) in enumerate(ordered_nodes)} # post-order

if all_edges_wo_dominance:
Expand Down Expand Up @@ -2607,6 +2643,8 @@ def _virtualize_edge(self, graph, full_graph, src, dst):
self.virtualized_edges.add((src, dst))
if new_src is not None:
self.replace_nodes(graph, src, new_src)
if self._node_order is not None:
self._node_order[new_src] = self._node_order[src]
if full_graph is not None:
self.virtualized_edges.add((src, dst))
full_graph.remove_edge(src, dst)
Expand Down Expand Up @@ -2916,10 +2954,29 @@ def _sort_edge(edge_):
src, dst = edge_
dst_in_degree = graph.in_degree[dst]
src_out_degree = graph.out_degree[src]
return -node_seq.get(dst), dst_in_degree, src_out_degree, -src.addr, -dst.addr # type: ignore
return -node_seq[dst], dst_in_degree, src_out_degree, -src.addr, -dst.addr # type: ignore

return sorted(edges, key=_sort_edge, reverse=True)

def _generate_node_order(self):
the_graph = (
self._region.graph_with_successors if self._region.graph_with_successors is not None else self._region.graph
)
the_head = self._region.head
ordered_nodes = GraphUtils.quasi_topological_sort_nodes(
the_graph,
loop_heads=[the_head],
)
self._node_order = {n: i for i, n in enumerate(ordered_nodes)}

def replace_nodes(self, graph, old_node_0, new_node, old_node_1=None, self_loop=True):
super().replace_nodes(graph, old_node_0, new_node, old_node_1=old_node_1, self_loop=self_loop)
if self._node_order is not None and graph is self._region.graph_with_successors:
if old_node_1 is not None:
self._node_order[new_node] = min(self._node_order[old_node_0], self._node_order[old_node_1])
else:
self._node_order[new_node] = self._node_order[old_node_0]

@staticmethod
def _replace_node_in_edge_list(edge_list: list[tuple], old_node, new_node) -> None:
for idx in range(len(edge_list)): # pylint:disable=consider-using-enumerate
Expand Down
5 changes: 2 additions & 3 deletions angr/analyses/decompiler/structuring/structurer_base.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,9 +6,9 @@

import networkx

import angr.ailment as ailment
import claripy

from angr import ailment
from angr.analyses import Analysis
from angr.analyses.decompiler.condition_processor import ConditionProcessor
from angr.analyses.decompiler.sequence_walker import SequenceWalker
Expand Down Expand Up @@ -971,8 +971,7 @@ def _update_new_sequences(self, removed_sequences: set[SequenceNode], replaced_s
new_sequences.append(new_seq_)
self._new_sequences = new_sequences

@staticmethod
def replace_nodes(graph, old_node_0, new_node, old_node_1=None, self_loop=True):
def replace_nodes(self, graph, old_node_0, new_node, old_node_1=None, self_loop=True): # pylint:disable=no-self-use
in_edges = list(graph.in_edges(old_node_0, data=True))
out_edges = list(graph.out_edges(old_node_0, data=True))
if old_node_1 is not None:
Expand Down
15 changes: 15 additions & 0 deletions angr/utils/graph.py
Original file line number Diff line number Diff line change
Expand Up @@ -673,6 +673,21 @@ def find_widening_points(function_addr, function_endpoints, graph): # pylint: d

return list(widening_addrs)

@staticmethod
def dfs_postorder_nodes_deterministic(graph: networkx.DiGraph, source):
visited = set()
stack = [source]
while stack:
node = stack[-1]
if node not in visited:
visited.add(node)
for succ in sorted(graph.successors(node), key=GraphUtils._sort_node):
if succ not in visited:
stack.append(succ)
else:
yield node
stack.pop()

@staticmethod
def reverse_post_order_sort_nodes(graph, nodes=None):
"""
Expand Down
10 changes: 4 additions & 6 deletions tests/analyses/decompiler/test_decompiler.py
Original file line number Diff line number Diff line change
Expand Up @@ -2275,11 +2275,7 @@ def test_decompiling_tr_build_spec_list(self, decompiler_options=None):
)
self._print_decompilation_result(d)

assert d.codegen.text.count("goto ") == 3
# `LABEL_400d08` is the label `try_bracketed_repeat` found in the source, which is jumped to twice
assert d.codegen.text.count("goto LABEL_400d08;") == 2
# this goto may go away in the future if the loops are structured correctly
assert d.codegen.text.count("goto LABEL_400d2a;") == 1
assert d.codegen.text.count("goto") == 0

@structuring_algo("sailr")
def test_decompiling_sha384sum_digest_bsd_split_3(self, decompiler_options=None):
Expand Down Expand Up @@ -3531,7 +3527,7 @@ def test_decompiling_tr_O2_parse_str(self, decompiler_options=None):

f = proj.kb.functions["parse_str"]
proj.analyses.CompleteCallingConventions(cfg=cfg, recover_variables=True, analyze_callsites=True)
d = proj.analyses[Decompiler](f, cfg=cfg.model, options=decompiler_options)
d = proj.analyses[Decompiler].prep(fail_fast=True)(f, cfg=cfg.model, options=decompiler_options)
self._print_decompilation_result(d)

line_count = d.codegen.text.count("\n")
Expand Down Expand Up @@ -3584,6 +3580,8 @@ def test_dd_iread_ret_dup_region(self, decompiler_options=None):
proj.analyses.CompleteCallingConventions(cfg=cfg, recover_variables=True)
f = proj.kb.functions["iread"]
d = proj.analyses[Decompiler].prep(fail_fast=True)(f, cfg=cfg.model, options=decompiler_options)
assert d.codegen is not None and d.codegen.text is not None
self._print_decompilation_result(d)
text = d.codegen.text

assert "{\n}" not in text
Expand Down