[ABI] Introduce indirect symbolic references to context descriptors.
Extend the mangling of symbolic references to also include indirect
symbolic references. This allows mangled names to refer to context
descriptors (both type and protocol) not in the current source file.

For now, only permit indirect symbolic references within the current module,
because remote mirrors (among other things) are unable to handle relocations.

Co-authored-by: Joe Groff <[email protected]>
DougGregor and jckarter committed Oct 23, 2018
1 parent 8d3da66 commit 5b41ac1
Showing 23 changed files with 610 additions and 233 deletions.
62 changes: 61 additions & 1 deletion docs/ABI/Mangling.rst
@@ -46,6 +46,48 @@ mangled name will start with the module name (after the ``_S``).
In the following, productions which are only _part_ of an operator, are
named with uppercase letters.

Symbolic references
~~~~~~~~~~~~~~~~~~~

The Swift compiler emits mangled names into binary images to encode
references to types for runtime instantiation and reflection. In a binary,
these mangled names may embed pointers to runtime data
structures in order to more efficiently represent locally-defined types.
We call these pointers **symbolic references**.
These references will be introduced by a control character in the range
`\x01` ... `\x1F`, which indicates the kind of symbolic reference, followed by
some number of arbitrary bytes *which may include null bytes*. Code that
processes mangled names out of Swift binaries needs to be aware of symbolic
references in order to properly terminate strings; a null terminator may be
part of a symbolic reference.

::

symbolic-reference ::= [\x01-\x17] .{4} // Relative symbolic reference
#if sizeof(void*) == 8
symbolic-reference ::= [\x18-\x1F] .{8} // Absolute symbolic reference
#elif sizeof(void*) == 4
symbolic-reference ::= [\x18-\x1F] .{4} // Absolute symbolic reference
#endif
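
As a rough illustration of why a plain `strlen` is not enough here, the following C++ sketch walks a mangled name and steps over symbolic-reference payloads according to the grammar above. `mangledNameLength` is a hypothetical helper invented for this example, not a function in the Swift runtime or demangler, and it assumes the input comes from a trusted, compiler-emitted image.

```cpp
#include <cstddef>

// Hypothetical helper (not a Swift runtime API): returns the number of bytes
// in the mangled name starting at `name`, excluding the final null terminator.
// Null bytes inside a symbolic-reference payload do not end the name.
std::size_t mangledNameLength(const char *name) {
  const char *p = name;
  while (*p != '\0') {
    auto c = static_cast<unsigned char>(*p);
    if (c >= 0x01 && c <= 0x17) {
      // Relative symbolic reference: control byte plus a 4-byte payload.
      p += 1 + 4;
    } else if (c >= 0x18 && c <= 0x1F) {
      // Absolute symbolic reference: control byte plus a pointer-sized payload.
      p += 1 + sizeof(void *);
    } else {
      // Ordinary mangling character.
      ++p;
    }
  }
  return static_cast<std::size_t>(p - name);
}
```

A scanner like this is only appropriate for compiler-emitted metadata; names from uncontrolled inputs should be rejected as soon as a control character in this range is seen.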

Symbolic references are only valid in compiler-emitted metadata structures
and must only appear in read-only parts of a binary image. APIs and tools
that interpret Swift mangled names from potentially uncontrolled inputs must
refuse to interpret symbolic references.

The following symbolic reference kinds are currently implemented:

::

{any-generic-type, protocol} ::= '\x01' .{4} // Reference points directly to context descriptor
{any-generic-type, protocol} ::= '\x02' .{4} // Reference points indirectly to context descriptor
// The grammatical role of the symbolic reference is determined by the
// kind of context descriptor referenced

protocol-conformance-ref ::= '\x03' .{4} // Reference points directly to protocol conformance descriptor (NOT IMPLEMENTED)
protocol-conformance-ref ::= '\x04' .{4} // Reference points indirectly to protocol conformance descriptor (NOT IMPLEMENTED)
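
The difference between the direct (`\x01`) and indirect (`\x02`) kinds can be sketched in C++ as follows. This is a minimal illustration, not the runtime's implementation: `Directness` is redefined locally, `decodeContextReferenceKind` and `resolveRelativeReference` are names invented for this example, and the sketch assumes the 4-byte payload is a signed offset relative to the address of the payload itself, which is how the resolver in `TypeRefBuilder.h` in this commit appears to treat it.

```cpp
#include <cstdint>
#include <cstring>

// Local stand-in for illustration only.
enum class Directness { Direct, Indirect };

// Hypothetical: map the control byte of a context-descriptor reference to its
// directness. Returns false for kinds this sketch does not handle.
bool decodeContextReferenceKind(unsigned char rawKind, Directness &out) {
  switch (rawKind) {
  case 0x01: out = Directness::Direct;   return true;
  case 0x02: out = Directness::Indirect; return true;
  default:   return false;
  }
}

// Hypothetical: `field` points at the 4-byte payload that follows the control
// byte. Assumes the payload is a signed offset from its own address.
const void *resolveRelativeReference(const void *field, Directness directness) {
  std::int32_t offset;
  std::memcpy(&offset, field, sizeof(offset));

  auto address = reinterpret_cast<std::uintptr_t>(field) +
                 static_cast<std::uintptr_t>(static_cast<std::intptr_t>(offset));

  if (directness == Directness::Indirect) {
    // The offset leads to a pointer-sized slot holding the real address.
    const void *target;
    std::memcpy(&target, reinterpret_cast<const void *>(address),
                sizeof(target));
    return target;
  }
  // The offset leads straight to the descriptor.
  return reinterpret_cast<const void *>(address);
}
```

The extra load in the indirect case is what lets a mangled name refer to a descriptor that is not laid out alongside it in the same object file.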


Globals
~~~~~~~

@@ -553,18 +595,36 @@ Generics

::

protocol-conformance ::= type protocol module generic-signature?
protocol-conformance-context ::= protocol module generic-signature?

protocol-conformance ::= type protocol-conformance-context

``<protocol-conformance>`` refers to a type's conformance to a protocol. The
named module is the one containing the extension or type declaration that
declared the conformance.

::

protocol-conformance ::= type protocol

If ``type`` is a generic parameter or associated type of one, then no module
is mangled, because the conformance must be resolved from the generic
environment.

protocol-conformance ::= context identifier protocol identifier generic-signature? // Property behavior conformance

Property behaviors are implemented using private protocol conformances.

::

concrete-protocol-conformance ::= type protocol-conformance-ref
protocol-conformance-ref ::= protocol module?

A compact representation of mangled protocol conformance witness
arguments at runtime. The ``module`` is only specified for conformances that
are "retroactive", meaning that the context in which the conformance is defined
is in neither the protocol's module nor the type's module.

::

generic-signature ::= requirement* 'l' // one generic parameter
24 changes: 17 additions & 7 deletions include/swift/AST/ASTMangler.h
@@ -46,12 +46,22 @@ class ASTMangler : public Mangler {
/// If disabled, it is an error to try to mangle such an entity.
bool AllowNamelessEntities = false;

/// If nonnull, provides a callback to encode symbolic references to
/// type contexts.
std::function<bool (const DeclContext *Context)>
CanSymbolicReference;

std::vector<std::pair<const DeclContext *, unsigned>> SymbolicReferences;
/// If enabled, some entities will be emitted as symbolic reference
/// placeholders. The offsets of these references will be stored in the
/// `SymbolicReferences` vector, and it is up to the consumer of the mangling
/// to fill these in.
bool AllowSymbolicReferences = false;

public:
using SymbolicReferent = llvm::PointerUnion<const NominalTypeDecl *,
const ProtocolConformance *>;
protected:

/// If set, the mangler calls this function to determine whether to symbolic
/// reference a given entity. Defaults to always returning true.
std::function<bool (SymbolicReferent)> CanSymbolicReference;

std::vector<std::pair<SymbolicReferent, unsigned>> SymbolicReferences;

public:
enum class SymbolKind {
@@ -292,7 +302,7 @@ class ASTMangler : public Mangler {

void appendOpParamForLayoutConstraint(LayoutConstraint Layout);

void appendSymbolicReference(const DeclContext *context);
void appendSymbolicReference(SymbolicReferent referent);

std::string mangleTypeWithoutPrefix(Type type) {
appendType(type);
8 changes: 7 additions & 1 deletion include/swift/Demangling/Demangle.h
@@ -33,6 +33,8 @@ namespace llvm {
namespace swift {
namespace Demangle {

enum class SymbolicReferenceKind : uint8_t;

struct DemangleOptions {
bool SynthesizeSugarOnTypes = false;
bool DisplayDebuggerGeneratedModule = true;
@@ -473,7 +475,9 @@ void mangleIdentifier(const char *data, size_t length,
/// This should always round-trip perfectly with demangleSymbolAsNode.
std::string mangleNode(const NodePointer &root);

using SymbolicResolver = llvm::function_ref<Demangle::NodePointer (const void *)>;
using SymbolicResolver =
llvm::function_ref<Demangle::NodePointer (SymbolicReferenceKind,
const void *)>;

/// \brief Remangle a demangled parse tree, using a callback to resolve
/// symbolic references.
@@ -537,6 +541,8 @@ class DemanglerPrinter {
return std::move(*this << std::forward<T>(x));
}

DemanglerPrinter &writeHex(unsigned long long n) &;

std::string &&str() && { return std::move(Stream); }

llvm::StringRef getStringRef() const { return Stream; }
4 changes: 2 additions & 2 deletions include/swift/Demangling/DemangleNodes.def
@@ -143,6 +143,7 @@ NODE(PrefixOperator)
NODE(PrivateDeclName)
NODE(PropertyDescriptor)
CONTEXT_NODE(Protocol)
CONTEXT_NODE(ProtocolSymbolicReference)
NODE(ProtocolConformance)
NODE(ProtocolDescriptor)
NODE(ProtocolConformanceDescriptor)
@@ -172,13 +173,13 @@ NODE(SpecializationIsFragile)
CONTEXT_NODE(Static)
CONTEXT_NODE(Structure)
CONTEXT_NODE(Subscript)
CONTEXT_NODE(SymbolicReference)
NODE(Suffix)
NODE(ThinFunctionType)
NODE(Tuple)
NODE(TupleElement)
NODE(TupleElementName)
NODE(Type)
CONTEXT_NODE(TypeSymbolicReference)
CONTEXT_NODE(TypeAlias)
NODE(TypeList)
NODE(TypeMangling)
@@ -192,7 +193,6 @@ NODE(TypeMetadataLazyCache)
NODE(UncurriedFunctionType)
#define REF_STORAGE(Name, ...) NODE(Name)
#include "swift/AST/ReferenceStorage.def"
CONTEXT_NODE(UnresolvedSymbolicReference)
CONTEXT_NODE(UnsafeAddressor)
CONTEXT_NODE(UnsafeMutableAddressor)
NODE(ValueWitness)
18 changes: 15 additions & 3 deletions include/swift/Demangling/Demangler.h
@@ -280,6 +280,17 @@ class CharVector : public Vector<char> {
}
};

/// Kinds of symbolic reference supported.
enum class SymbolicReferenceKind : uint8_t {
/// A symbolic reference to a context descriptor, representing the
/// (unapplied generic) context.
Context,
};

using SymbolicReferenceResolver_t = NodePointer (SymbolicReferenceKind,
Directness,
int32_t, const void *);

/// The demangler.
///
/// It de-mangles a string and it also owns the returned node-tree. This means
@@ -301,7 +312,7 @@ class Demangler : public NodeFactory {
StringRef Words[MaxNumWords];
int NumWords = 0;

std::function<NodePointer (int32_t, const void *)> SymbolicReferenceResolver;
std::function<SymbolicReferenceResolver_t> SymbolicReferenceResolver;

bool nextIf(StringRef str) {
if (!Text.substr(Pos).startswith(str)) return false;
@@ -472,7 +483,8 @@ class Demangler : public NodeFactory {

NodePointer demangleObjCTypeName();
NodePointer demangleTypeMangling();
NodePointer demangleSymbolicReference(const void *at);
NodePointer demangleSymbolicReference(unsigned char rawKind,
const void *at);

void dump();

@@ -483,7 +495,7 @@ class Demangler : public NodeFactory {

/// Install a resolver for symbolic references in a mangled string.
void setSymbolicReferenceResolver(
std::function<NodePointer (int32_t, const void*)> resolver) {
std::function<SymbolicReferenceResolver_t> resolver) {
SymbolicReferenceResolver = resolver;
}

19 changes: 10 additions & 9 deletions include/swift/Demangling/TypeDecoder.h
@@ -114,7 +114,7 @@ class TypeDecoder {
case NodeKind::Enum:
case NodeKind::Structure:
case NodeKind::TypeAlias: // This can show up for imported Clang decls.
case NodeKind::SymbolicReference:
case NodeKind::TypeSymbolicReference:
{
BuiltNominalTypeDecl typeDecl = BuiltNominalTypeDecl();
BuiltType parent = BuiltType();
@@ -228,7 +228,8 @@ class TypeDecoder {
IsClassBound);
}

case NodeKind::Protocol: {
case NodeKind::Protocol:
case NodeKind::ProtocolSymbolicReference: {
if (auto Proto = decodeMangledProtocolType(Node)) {
return Builder.createProtocolCompositionType(Proto, BuiltType(),
/*IsClassBound=*/false);
@@ -473,14 +474,14 @@ class TypeDecoder {
}

private:
bool decodeMangledNominalType(const Demangle::NodePointer &node,
bool decodeMangledNominalType(Demangle::NodePointer node,
BuiltNominalTypeDecl &typeDecl,
BuiltType &parent) {
if (node->getKind() == NodeKind::Type)
return decodeMangledNominalType(node->getChild(0), typeDecl, parent);

Demangle::NodePointer nominalNode;
if (node->getKind() == NodeKind::SymbolicReference) {
if (node->getKind() == NodeKind::TypeSymbolicReference) {
// A symbolic reference can be directly resolved to a nominal type.
nominalNode = node;
} else {
@@ -519,19 +520,19 @@ class TypeDecoder {
return true;
}

BuiltProtocolDecl decodeMangledProtocolType(
const Demangle::NodePointer &node) {
BuiltProtocolDecl decodeMangledProtocolType(Demangle::NodePointer node) {
if (node->getKind() == NodeKind::Type)
return decodeMangledProtocolType(node->getChild(0));

if (node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
if ((node->getNumChildren() < 2 || node->getKind() != NodeKind::Protocol)
&& node->getKind() != NodeKind::ProtocolSymbolicReference)
return BuiltProtocolDecl();

return Builder.createProtocolDecl(node);
}

bool decodeMangledFunctionInputType(
const Demangle::NodePointer &node,
Demangle::NodePointer node,
std::vector<FunctionParam<BuiltType>> &params,
bool &hasParamFlags) {
// Look through a couple of sugar nodes.
@@ -542,7 +543,7 @@ class TypeDecoder {
}

auto decodeParamTypeAndFlags =
[&](const Demangle::NodePointer &typeNode,
[&](Demangle::NodePointer typeNode,
FunctionParam<BuiltType> &param) -> bool {
Demangle::NodePointer node = typeNode;

31 changes: 23 additions & 8 deletions include/swift/Reflection/TypeRefBuilder.h
@@ -354,15 +354,30 @@ class TypeRefBuilder {
// demangling out of the referenced context descriptors in the target
// process.
Dem.setSymbolicReferenceResolver(
[this, &reader](int32_t offset, const void *base) -> Demangle::NodePointer {
// Resolve the reference to a remote address.
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
if (remoteAddress == 0)
[this, &reader](SymbolicReferenceKind kind,
Directness directness,
int32_t offset, const void *base) -> Demangle::NodePointer {
// Resolve the reference to a remote address.
auto remoteAddress = getRemoteAddrOfTypeRefPointer(base);
if (remoteAddress == 0)
return nullptr;

auto address = remoteAddress + offset;
if (directness == Directness::Indirect) {
if (auto indirectAddress = reader.readPointerValue(address)) {
address = *indirectAddress;
} else {
return nullptr;

return reader.readDemanglingForContextDescriptor(remoteAddress + offset,
Dem);
});
}
}

switch (kind) {
case Demangle::SymbolicReferenceKind::Context:
return reader.readDemanglingForContextDescriptor(address, Dem);
}

return nullptr;
});
}

TypeConverter &getTypeConverter() { return TC; }