Some preparation for function calls. (FuelLabs#2471)
* Stop recreating multiple copies of the same function when converting to IR.

* Improve the `inline` IR pass.

Instead of inlining every function always, it is now possible to use a
heuristic predicate to decide whether a function should be inlined.

* Fix a bug in the `constcombine` IR pass.
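The predicate-driven shape of the reworked `inline` pass can be illustrated without any of the real `sway-ir` types. This is only a minimal sketch under stated assumptions: `ToyFn`, `select_calls_to_inline` and `select_all` are hypothetical names invented here, whereas the actual pass works on `Context`, `Function` and `Value` and rewrites the IR rather than returning names.

```rust
/// Hypothetical stand-in for a callee; not a `sway-ir` type.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyFn {
    pub name: &'static str,
    pub num_instructions: usize,
}

/// Return the names of the callees which the predicate accepts for inlining.
/// The caller supplies the heuristic; this function only applies it.
pub fn select_calls_to_inline<F>(calls: &[ToyFn], predicate: F) -> Vec<&'static str>
where
    F: Fn(&ToyFn) -> bool,
{
    calls
        .iter()
        .filter(|f| predicate(*f))
        .map(|f| f.name)
        .collect()
}

/// "Inline everything" is just the predicate which always answers yes.
pub fn select_all(calls: &[ToyFn]) -> Vec<&'static str> {
    select_calls_to_inline(calls, |_| true)
}
```

Expressing "inline all" as a special case of "inline some" keeps a single code path, which is the same arrangement the commit uses for `inline_all_function_calls()`.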
otrho authored Aug 9, 2022
1 parent a23accd commit 1f675c1
Showing 16 changed files with 702 additions and 87 deletions.
93 changes: 56 additions & 37 deletions sway-core/src/ir_generation/function.rs
@@ -34,6 +34,7 @@ pub(super) struct FnCompiler {
     pub(super) block_to_break_to: Option<Block>,
     pub(super) block_to_continue_to: Option<Block>,
     lexical_map: LexicalMap,
+    recreated_fns: HashMap<(Span, Vec<TypeId>, Vec<TypeId>), Function>,
 }

 pub(super) enum StateAccessType {
@@ -55,6 +56,7 @@ impl FnCompiler {
             block_to_break_to: None,
             block_to_continue_to: None,
             lexical_map,
+            recreated_fns: HashMap::new(),
         }
     }

@@ -787,46 +789,63 @@ impl FnCompiler {
         self_state_idx: Option<StateIndex>,
         span_md_idx: Option<MetadataIndex>,
     ) -> Result<Value, CompileError> {
-        // XXX OK, now, the old compiler inlines everything very lazily. Function calls include
-        // the body of the callee (i.e., the callee_body arg above) and so codegen just pulled it
-        // straight in, no questions asked. Library functions are provided in an initial namespace
-        // from Forc and when the parser builds the AST (or is it during type checking?) these
-        // function bodies are embedded.
+        // The compiler inlines everything very lazily. Function calls include the body of the
+        // callee (i.e., the callee_body arg above). Library functions are provided in an initial
+        // namespace from Forc and when the parser builds the AST (or is it during type checking?)
+        // these function bodies are embedded.
         //
-        // We're going to build little single-use instantiations of the callee and then call them.
-        // For now if they're called in multiple places they'll be redundantly recreated, but also
-        // at present we are still inlining everything so it actually makes little difference.
+        // Here we build little single-use instantiations of the callee and then call them. Naming
+        // is not yet absolute so we must ensure the function names are unique.
         //
-        // Eventually we need to Do It Properly and inline only when necessary, and compile the
-        // standard library to an actual module.
-
-        {
-            let callee_name = format!("{}_{}", callee.name, context.get_unique_id());
-
-            let mut callee_fn_decl = callee;
-            callee_fn_decl.type_parameters.clear();
-            callee_fn_decl.name = Ident::new(Span::from_string(callee_name));
-
-            let callee = compile_function(context, md_mgr, self.module, callee_fn_decl)?;
+        // Eventually we need to Do It Properly and inline into the AST only when necessary, and
+        // compile the standard library to an actual module.
+
+        // Get the callee from the cache if we've already compiled it. We can't insert it with
+        // .entry() since `compile_function()` returns a Result we need to handle. The key to our
+        // cache, to uniquely identify a function instance, is the span and the type IDs of any
+        // args and type parameters. It's using the Sway types rather than IR types, which would
+        // be more accurate but also more fiddly.
+        let fn_key = (
+            callee.span(),
+            callee.parameters.iter().map(|p| p.type_id).collect(),
+            callee.type_parameters.iter().map(|tp| tp.type_id).collect(),
+        );
+        let callee = match self.recreated_fns.get(&fn_key).copied() {
+            Some(func) => func,
+            None => {
+                let callee_fn_decl = TypedFunctionDeclaration {
+                    type_parameters: Vec::new(),
+                    name: Ident::new(Span::from_string(format!(
+                        "{}_{}",
+                        callee.name,
+                        context.get_unique_id()
+                    ))),
+                    ..callee
+                };
+                let new_func =
+                    compile_function(context, md_mgr, self.module, callee_fn_decl)?.unwrap();
+                self.recreated_fns.insert(fn_key, new_func);
+                new_func
+            }
+        };

-            // Now actually call the new function.
-            let args = ast_args
-                .into_iter()
-                .map(|(_, expr)| self.compile_expression(context, md_mgr, expr))
-                .collect::<Result<Vec<Value>, CompileError>>()?;
-            let state_idx_md_idx = match self_state_idx {
-                Some(self_state_idx) => {
-                    md_mgr.storage_key_to_md(context, self_state_idx.to_usize() as u64)
-                }
-                None => None,
-            };
-            Ok(self
-                .current_block
-                .ins(context)
-                .call(callee.unwrap(), &args)
-                .add_metadatum(context, span_md_idx)
-                .add_metadatum(context, state_idx_md_idx))
-        }
+        // Now actually call the new function.
+        let args = ast_args
+            .into_iter()
+            .map(|(_, expr)| self.compile_expression(context, md_mgr, expr))
+            .collect::<Result<Vec<Value>, CompileError>>()?;
+        let state_idx_md_idx = match self_state_idx {
+            Some(self_state_idx) => {
+                md_mgr.storage_key_to_md(context, self_state_idx.to_usize() as u64)
+            }
+            None => None,
+        };
+        Ok(self
+            .current_block
+            .ins(context)
+            .call(callee, &args)
+            .add_metadatum(context, span_md_idx)
+            .add_metadatum(context, state_idx_md_idx))
     }

     fn compile_if(
12 changes: 12 additions & 0 deletions sway-ir/src/function.rs
@@ -205,6 +205,18 @@ impl Function {
         idx
     }

+    /// Return the number of blocks in this function.
+    pub fn num_blocks(&self, context: &Context) -> usize {
+        context.functions[self.0].blocks.len()
+    }
+
+    /// Return the number of instructions in this function.
+    pub fn num_instructions(&self, context: &Context) -> usize {
+        self.block_iter(context)
+            .map(|block| block.num_instructions(context))
+            .sum()
+    }
+
     /// Return the function name.
     pub fn get_name<'a>(&self, context: &'a Context) -> &'a str {
         &context.functions[self.0].name
29 changes: 9 additions & 20 deletions sway-ir/src/optimize/constants.rs
@@ -62,19 +62,18 @@ fn combine_const_insert_values(context: &mut Context, function: &Function) -> bo
         });

     if let Some((block, ins_val, aggregate, const_val, indices)) = candidate {
-        // OK, here we have an `insert_value` of a constant directly into a constant
-        // aggregate. We want to replace the constant aggregate with an updated one.
-        let new_aggregate =
-            combine_const_aggregate_field(context, function, aggregate, const_val, &indices);
+        // OK, here we have an `insert_value` of a constant directly into a constant aggregate. We
+        // want to replace the constant aggregate with an updated one.
+        let new_aggregate = combine_const_aggregate_field(context, aggregate, const_val, &indices);

         // Replace uses of the `insert_value` instruction with the new aggregate.
         function.replace_value(context, ins_val, new_aggregate, None);

         // Remove the `insert_value` instruction.
         block.remove_instruction(context, ins_val);

-        // Let's return now, since our iterator may get confused and let the pass
-        // iterate further itself.
+        // Let's return now, since our iterator may get confused and let the pass iterate further
+        // itself.
         return true;
     }

@@ -83,7 +82,6 @@ fn combine_const_insert_values(context: &mut Context, function: &Function) -> bo

 fn combine_const_aggregate_field(
     context: &mut Context,
-    function: &Function,
     aggregate: Value,
     const_value: Value,
     indices: &[u64],
@@ -108,20 +106,11 @@ fn combine_const_aggregate_field(
     // Update the new aggregate with the constant field, based in the indices.
     inject_constant_into_aggregate(&mut new_aggregate, const_value, indices);

-    // Replace the old aggregate with the new aggregate.
-    let new_aggregate_value =
-        Value::new_constant(context, new_aggregate).add_metadatum(context, metadata);
-    function.replace_value(context, aggregate, new_aggregate_value, None);
+    // NOTE: Previous versions of this pass were trying to clean up after themselves, by replacing
+    // the old aggregate with this new one, and/or removing the old aggregate altogether. This is
+    // too dangerous without proper checking for remaining uses, and is best left to DCE anyway.

-    // Remove the old aggregate from the context.
-    //
-    // OR NOT! This is too dangerous unless we can
-    // guarantee it has no uses, which is something we should implement eventually. For now, in
-    // this case it shouldn't matter if we leave it, even if it's not used.
-    //
-    // TODO: context.values.remove(aggregate.0);
-
-    new_aggregate_value
+    Value::new_constant(context, new_aggregate).add_metadatum(context, metadata)
 }

 fn inject_constant_into_aggregate(aggregate: &mut Constant, value: Constant, indices: &[u64]) {
85 changes: 81 additions & 4 deletions sway-ir/src/optimize/inline.rs
@@ -11,6 +11,7 @@ use crate::{
     error::IrError,
     function::Function,
     instruction::Instruction,
+    irtype::Type,
     metadata::{combine, MetadataIndex},
     pointer::Pointer,
     value::{Value, ValueContent, ValueDatum},
@@ -20,21 +21,43 @@ use crate::{
 ///
 /// e.g., If this is applied to main() then all calls in the program are removed. This is
 /// obviously dangerous for recursive functions, in which case this pass would inline forever.
 pub fn inline_all_function_calls(
     context: &mut Context,
     function: &Function,
 ) -> Result<bool, IrError> {
+    inline_some_function_calls(context, function, |_, _, _| true)
+}
+
+/// Inline function calls based on a provided heuristic predicate.
+///
+/// There are many things to consider when deciding to inline a function. For example:
+/// - The size of the function, especially if smaller than the call overhead size.
+/// - The stack frame size of the function.
+/// - The number of calls made to the function or if the function is called inside a loop.
+/// - A particular call has constant arguments implying further constant folding.
+/// - An attribute request, e.g., #[always_inline], #[never_inline].
+pub fn inline_some_function_calls<F: Fn(&Context, &Function, &Value) -> bool>(
+    context: &mut Context,
+    function: &Function,
+    predicate: F,
+) -> Result<bool, IrError> {
     let mut modified = false;
     loop {
-        // Find the next call site.
+        // Find the next call site which passes the predicate.
         let call_data = function
             .instruction_iter(context)
             .find_map(|(block, call_val)| match context.values[call_val.0].value {
-                ValueDatum::Instruction(Instruction::Call(inlined_function, _)) => {
-                    Some((block, call_val, inlined_function))
-                }
+                ValueDatum::Instruction(Instruction::Call(inlined_function, _)) => predicate(
+                    context,
+                    &inlined_function,
+                    &call_val,
+                )
+                .then_some((block, call_val, inlined_function)),
                 _ => None,
             });

         match call_data {
             Some((block, call_val, inlined_function)) => {
                 inline_function_call(context, *function, block, call_val, inlined_function)?;
@@ -46,10 +69,64 @@ pub fn inline_all_function_calls(
     Ok(modified)
 }

+/// A utility to get a predicate which can be passed to inline_some_function_calls() based on
+/// certain sizes of the function. If a constraint is None then any size is assumed to be
+/// acceptable.
+///
+/// The max_stack_size is a bit tricky, as the IR doesn't really know (or care) about the size of
+/// types. See the source code for how it works.
+pub fn is_small_fn(
+    max_blocks: Option<usize>,
+    max_instrs: Option<usize>,
+    max_stack_size: Option<usize>,
+) -> impl Fn(&Context, &Function, &Value) -> bool {
+    fn count_type_elements(context: &Context, ty: &Type) -> usize {
+        // This is meant to just be a heuristic rather than be super accurate.
+        match ty {
+            Type::Unit | Type::Bool | Type::Uint(_) | Type::B256 | Type::String(_) => 1,
+            Type::Array(aggregate) => {
+                let (ty, sz) = context.aggregates[aggregate.0].array_type();
+                count_type_elements(context, ty) * *sz as usize
+            }
+            Type::Union(aggregate) => context.aggregates[aggregate.0]
+                .field_types()
+                .iter()
+                .map(|ty| count_type_elements(context, ty))
+                .max()
+                .unwrap_or(1),
+            Type::Struct(aggregate) => context.aggregates[aggregate.0]
+                .field_types()
+                .iter()
+                .map(|ty| count_type_elements(context, ty))
+                .sum(),
+        }
+    }
+
+    move |context: &Context, function: &Function, _call_site: &Value| -> bool {
+        max_blocks
+            .map(|max_block_count| function.num_blocks(context) <= max_block_count)
+            .unwrap_or(true)
+            && max_instrs
+                .map(|max_instrs_count| function.num_instructions(context) <= max_instrs_count)
+                .unwrap_or(true)
+            && max_stack_size
+                .map(|max_stack_size_count| {
+                    function
+                        .locals_iter(context)
+                        .map(|(_name, ptr)| count_type_elements(context, ptr.get_type(context)))
+                        .sum::<usize>()
+                        <= max_stack_size_count
+                })
+                .unwrap_or(true)
+    }
+}
+
 /// Inline a function to a specific call site within another function.
 ///
 /// The destination function, block and call site must be specified along with the function to
 /// inline.
 pub fn inline_function_call(
     context: &mut Context,
     function: Function,
46 changes: 46 additions & 0 deletions sway-ir/tests/README.md
@@ -0,0 +1,46 @@
# Notes on the Inliner Unit Testing

Each file in the `inline` directory is passed through the inliner and verified using
`FileCheck`.

## Parameters

The first line of the IR file must be a comment containing the parameters for the pass. These may
be:

* The single word `all`, indicating all `CALL`s found throughout the input will be inlined.
* A combination of sizes which are passed to the `optimize::inline::is_small_fn()` function:
  * `blocks N` to allow at most `N` blocks.
  * `instrs N` to allow at most `N` instructions.
  * `stack N` to allow at most `N` stack elements.

Any keyword found later in the line will override an earlier parameter. `all` will override any
other constraint.
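As an illustration of these rules, the sketch below parses such a first-line comment. It is a hypothetical helper, not the parser used by the test harness (which this README does not show); the names `InlineParams` and `parse_params` and the choice to reject unknown keywords are assumptions made for the example.

```rust
/// Hypothetical parsed form of the first-line comment; not taken from the test harness.
#[derive(Debug, Default, PartialEq)]
pub struct InlineParams {
    /// Set by the keyword `all`; when true the size limits are meant to be ignored.
    pub all: bool,
    pub max_blocks: Option<usize>,
    pub max_instrs: Option<usize>,
    pub max_stack: Option<usize>,
}

/// Parse a line such as `// blocks 2 instrs 20`. Returns `None` if the line is not a
/// comment, a keyword is unknown, or a size is missing or not a number.
pub fn parse_params(first_line: &str) -> Option<InlineParams> {
    let body = first_line.trim().strip_prefix("//")?;
    let mut params = InlineParams::default();
    let mut words = body.split_whitespace();
    while let Some(word) = words.next() {
        match word {
            "all" => params.all = true,
            // Assigning unconditionally means a later keyword overrides an earlier one.
            "blocks" => params.max_blocks = Some(words.next()?.parse().ok()?),
            "instrs" => params.max_instrs = Some(words.next()?.parse().ok()?),
            "stack" => params.max_stack = Some(words.next()?.parse().ok()?),
            _ => return None,
        }
    }
    Some(params)
}
```

A caller would check `all` first and only fall back to the three limits when it is false, which gives `all` priority over any other constraint.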

### Example

To just inline everything:

```rust
// all
```

To inline only functions which have at most 2 blocks:

```rust
// blocks 2
```

To inline only functions which have at most 2 blocks, at most 20 instructions and no more than 10
stack elements:

```rust
// blocks 2 instrs 20 stack 10
```

See the source for `optimize::inline::is_small_fn()` for further clarification.
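For readers without the source to hand, the following is a rough, self-contained sketch of the two ideas behind that function: stack size is estimated by counting type elements, and each limit is optional. `ToyType`, `FnSize`, `count_elements` and `is_small` are hypothetical stand-ins written for this note; the real code works on `sway-ir` types and a `Context`.

```rust
/// Hypothetical simplified type; the real pass matches on the `sway-ir` `Type` enum.
pub enum ToyType {
    Scalar,
    Array(Box<ToyType>, usize),
    Union(Vec<ToyType>),
    Struct(Vec<ToyType>),
}

/// Heuristic element count: arrays multiply, unions take their largest
/// variant, structs sum their fields.
pub fn count_elements(ty: &ToyType) -> usize {
    match ty {
        ToyType::Scalar => 1,
        ToyType::Array(elem, len) => count_elements(elem) * len,
        ToyType::Union(variants) => variants.iter().map(count_elements).max().unwrap_or(1),
        ToyType::Struct(fields) => fields.iter().map(count_elements).sum::<usize>(),
    }
}

/// Hypothetical size summary of a function.
pub struct FnSize {
    pub blocks: usize,
    pub instrs: usize,
    pub stack_elems: usize,
}

/// Build a predicate from optional limits; a limit of `None` accepts any size.
pub fn is_small(
    max_blocks: Option<usize>,
    max_instrs: Option<usize>,
    max_stack: Option<usize>,
) -> impl Fn(&FnSize) -> bool {
    move |size: &FnSize| {
        max_blocks.map_or(true, |max| size.blocks <= max)
            && max_instrs.map_or(true, |max| size.instrs <= max)
            && max_stack.map_or(true, |max| size.stack_elems <= max)
    }
}
```

Under this sketch, `// blocks 2 instrs 20 stack 10` would correspond to `is_small(Some(2), Some(20), Some(10))`, and a local holding a struct of a scalar plus a four-element array would count as five stack elements.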

### Caveats

This is a little bit lame and perhaps a proper looking command line (and parser) would be better,
e.g., `// run --blocks 2 --instrs 20` but this will do for a start.
2 changes: 2 additions & 0 deletions sway-ir/tests/inline/bigger.ir
@@ -1,3 +1,5 @@
+// all
+//
 // Based on this Sway:
 //
 // script;