Add support for the static analyzer to synthesize function implementations from external model files.

Currently the analyzer lazily models some functions using 'BodyFarm',
which constructs a fake function implementation that approximates
the semantics of the function and that the analyzer can simulate
when the function is called.  BodyFarm does this by constructing
the AST for such definitions on the fly.  One strength of BodyFarm
is that all symbols and types referenced by synthesized function
bodies are contextually adapted to the containing translation unit.
The downside is that these ASTs are hardcoded in Clang's own
source code.

A more scalable approach is to allow these models to be defined as
source code in separate "model" files and have the analyzer use those
definitions lazily when a function body is needed.  Among other things,
this will allow more customization of the analyzer for specific APIs
and platforms.

This patch provides the initial infrastructure for this feature.
It extends BodyFarm to use an abstract API 'CodeInjector' that can be
used to synthesize function bodies.  That 'CodeInjector' is
implemented using a new 'ModelInjector' in libFrontend, which lazily
parses a model file and injects the ASTs into the current translation
unit.  
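
The arrangement described above can be sketched in miniature.  This is
an illustration only, not the patch's code: 'Stmt', 'BodyInjector' and
'LazyModelInjector' are stand-ins invented for this example, and the
loader callback takes the place of parsing a model file.  It shows an
abstract interface whose implementation produces a body on first
request and remembers the outcome.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Stand-in for clang::Stmt; the real class is far richer.
struct Stmt {
  std::string Text;
};

// Stand-in for the abstract injector interface: return a body for the named
// function, or nullptr if none is available.
class BodyInjector {
public:
  virtual ~BodyInjector() = default;
  virtual const Stmt *getBody(const std::string &FunctionName) = 0;
};

// Hypothetical injector that loads a model on first request and memoizes the
// outcome, including failures, so each function is looked up at most once.
class LazyModelInjector : public BodyInjector {
public:
  // The loader fills in the body and returns true if a model was found.
  using Loader = std::function<bool(const std::string &, Stmt &)>;

  explicit LazyModelInjector(Loader L) : Load(std::move(L)) {}

  const Stmt *getBody(const std::string &FunctionName) override {
    auto It = Cache.find(FunctionName);
    if (It == Cache.end()) {
      Entry E;
      E.Found = Load(FunctionName, E.Body);
      ++LoadCount;
      It = Cache.emplace(FunctionName, std::move(E)).first;
    }
    return It->second.Found ? &It->second.Body : nullptr;
  }

  // Number of times the loader actually ran.
  int loadCount() const { return LoadCount; }

private:
  struct Entry {
    bool Found = false;
    Stmt Body;
  };

  Loader Load;
  std::unordered_map<std::string, Entry> Cache;
  int LoadCount = 0;
};
```

Caching negative results as well as positive ones matters here, because
the analyzer may ask about the same undefined function many times.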

Models are currently found by specifying a 'model-path' as an
analyzer option; if no path is specified the CodeInjector is not
used, thus defaulting to the current behavior in the analyzer.

Models currently contain a single function definition, and are
located by looking for the file <function name>.model.  This is a
starting point for something richer, but it bootstraps
this feature for future evolution.
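
As a rough illustration of that lookup rule, the helper below builds
the file name for a given function.  'modelFilePath' is a hypothetical
name, and joining the directory and file name with '/' is an
assumption of this sketch; only the "<function name>.model" naming and
the "no path means disabled" behavior come from the description above.

```cpp
#include <string>

// Hypothetical helper: returns the path of the model file for a function,
// or an empty string when no model lookup should happen.
std::string modelFilePath(const std::string &ModelPath,
                          const std::string &FunctionName) {
  // No model path configured: the feature is off, so nothing is looked up.
  if (ModelPath.empty() || FunctionName.empty())
    return std::string();

  std::string Path = ModelPath;
  // Assumption of this sketch: '/' is the directory separator.
  if (Path.back() != '/')
    Path += '/';
  Path += FunctionName;
  Path += ".model";
  return Path;
}
```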

This patch was contributed by Gábor Horváth as part of his
Google Summer of Code project.

Some notes:

- This introduces the notion of a "model file" into
  FrontendAction and the Preprocessor.  This nomenclature
  is specific to the static analyzer, but possibly could be
  generalized.  Essentially these are sources pulled in
  from outside the principal translation unit.

  Preprocessor gets an 'InitializeForModelFile' and
  'FinalizeForModelFile' which could possibly be hoisted out
  of Preprocessor if Preprocessor exposed a new API to
  change the PragmaHandlers and some other internal pieces.  This
  can be revisited.

  FrontendAction gets an 'isModelParsingAction()' predicate function
  used to allow a new FrontendAction to recycle the Preprocessor
  and ASTContext.  This name could probably be made something
  more general (i.e., not tied to 'model files') at the expense
  of losing the intent of why it exists.  This can be revisited.

- This is a moderately sized patch; it has gone through some amount of
  offline code review.  Most of the changes to the non-analyzer
  parts are fairly small, and would make little sense without
  the analyzer changes.

- Most of the analyzer changes are plumbing, with the interesting
  behavior being introduced by ModelInjector.cpp and
  ModelConsumer.cpp.

- The new functionality introduced by this change is off by default.
  It requires an analyzer config option to enable.

git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@216550 91177308-0d34-0410-b5e6-96231b3b80d8
tkremenek committed Aug 27, 2014
1 parent 702b970 commit fdf0d35
Showing 28 changed files with 803 additions and 42 deletions.
17 changes: 16 additions & 1 deletion include/clang/Analysis/AnalysisContext.h
@@ -17,8 +17,10 @@

#include "clang/AST/Decl.h"
#include "clang/Analysis/CFG.h"
#include "clang/Analysis/CodeInjector.h"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/FoldingSet.h"
#include "llvm/ADT/OwningPtr.h"
#include "llvm/Support/Allocator.h"
#include <memory>

@@ -143,6 +145,14 @@ class AnalysisDeclContext {
/// \sa getBody
bool isBodyAutosynthesized() const;

/// \brief Checks if the body of the Decl is generated by the BodyFarm from a
/// model file.
///
/// Note, the lookup is not free. We are going to call getBody behind
/// the scenes.
/// \sa getBody
bool isBodyAutosynthesizedFromModelFile() const;

CFG *getCFG();

CFGStmtMap *getCFGStmtMap();
@@ -398,6 +408,10 @@ class AnalysisDeclContextManager {
ContextMap Contexts;
LocationContextManager LocContexts;
CFG::BuildOptions cfgBuildOptions;

/// Pointer to an interface that can provide function bodies for
/// declarations from an external source.
llvm::OwningPtr<CodeInjector> Injector;

/// Flag to indicate whether or not bodies should be synthesized
/// for well-known functions.
@@ -410,7 +424,8 @@ class AnalysisDeclContextManager {
bool addTemporaryDtors = false,
bool synthesizeBodies = false,
bool addStaticInitBranches = false,
-bool addCXXNewAllocator = true);
+bool addCXXNewAllocator = true,
+CodeInjector* injector = nullptr);

~AnalysisDeclContextManager();

46 changes: 46 additions & 0 deletions include/clang/Analysis/CodeInjector.h
@@ -0,0 +1,46 @@
//===-- CodeInjector.h ------------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
///
/// \file
/// \brief Defines the clang::CodeInjector interface which is responsible for
/// injecting AST of function definitions that may not be available in the
/// original source.
///
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_ANALYSIS_CODEINJECTOR_H
#define LLVM_CLANG_ANALYSIS_CODEINJECTOR_H

namespace clang {

class Stmt;
class FunctionDecl;
class ObjCMethodDecl;

/// \brief CodeInjector is an interface which is responsible for injecting AST
/// of function definitions that may not be available in the original source.
///
/// The getBody function will be called each time the static analyzer examines a
/// function call that has no definition available in the current translation
/// unit. If the returned statement is not a null pointer, it is assumed to be
/// the body of a function which will be used for the analysis. The source of
/// the body can be arbitrary, but it is advised to use memoization to avoid
/// unnecessary reparsing of the external source that provides the body of the
/// functions.
class CodeInjector {
public:
CodeInjector();
virtual ~CodeInjector();

virtual Stmt *getBody(const FunctionDecl *D) = 0;
virtual Stmt *getBody(const ObjCMethodDecl *D) = 0;
};
}

#endif
1 change: 0 additions & 1 deletion include/clang/Basic/SourceManager.h
@@ -754,7 +754,6 @@ class SourceManager : public RefCountedBase<SourceManager> {

/// \brief Set the file ID for the main source file.
void setMainFileID(FileID FID) {
-assert(MainFileID.isInvalid() && "MainFileID already set!");
MainFileID = FID;
}

8 changes: 8 additions & 0 deletions include/clang/Frontend/FrontendAction.h
@@ -157,6 +157,13 @@ class FrontendAction {
/// @name Supported Modes
/// @{

/// \brief Is this action invoked on a model file?
///
/// Model files are incomplete translation units that rely on type
/// information from another translation unit. Check ParseModelFileAction for
/// details.
virtual bool isModelParsingAction() const { return false; }

/// \brief Does this action only use the preprocessor?
///
/// If so no AST context will be created and this action will be invalid
@@ -224,6 +231,7 @@ class ASTFrontendAction : public FrontendAction {
void ExecuteAction() override;

public:
ASTFrontendAction() {}
bool usesPreprocessorOnly() const override { return false; }
};

15 changes: 15 additions & 0 deletions include/clang/Lex/Preprocessor.h
@@ -194,6 +194,10 @@ class Preprocessor : public RefCountedBase<Preprocessor> {
/// with this preprocessor.
PragmaNamespace *PragmaHandlers;

/// \brief Pragma handlers of the original source are stored here during the
/// parsing of a model file.
PragmaNamespace *PragmaHandlersBackup;

/// \brief Tracks all of the comment handlers that the client registered
/// with this preprocessor.
std::vector<CommentHandler *> CommentHandlers;
@@ -464,6 +468,17 @@ class Preprocessor : public RefCountedBase<Preprocessor> {
/// lifetime of the preprocessor.
void Initialize(const TargetInfo &Target);

/// \brief Initialize the preprocessor to parse a model file
///
/// To parse model files the preprocessor of the original source is reused to
/// preserve the identifier table. However, to avoid some duplicate
/// information in the preprocessor, some cleanup is needed before it is used
/// to parse model files. This method does that cleanup.
void InitializeForModelFile();

/// \brief Cleanup after model file parsing
void FinalizeForModelFile();

/// \brief Retrieve the preprocessor options used to initialize this
/// preprocessor.
PreprocessorOptions &getPreprocessorOpts() const { return *PPOpts; }
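
The bodies of 'InitializeForModelFile' and 'FinalizeForModelFile' are
not part of this excerpt.  A rough sketch of the backup-and-restore
idea that the 'PragmaHandlersBackup' comment describes might look like
the following; 'HandlerSet' and 'MiniPreprocessor' are stand-ins
invented for this example, and starting the model file with an empty
handler set is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Stand-in for PragmaNamespace: just a set of registered pragma names.
struct HandlerSet {
  std::set<std::string> Names;
};

// Stand-in for the slice of preprocessor state discussed above.
class MiniPreprocessor {
public:
  MiniPreprocessor() : Handlers(new HandlerSet) {}

  void addPragma(const std::string &Name) { Handlers->Names.insert(Name); }

  bool hasPragma(const std::string &Name) const {
    return Handlers->Names.count(Name) != 0;
  }

  // Park the main file's handlers and give the model file a fresh set.
  void initializeForModelFile() {
    assert(!Backup && "already parsing a model file");
    Backup = std::move(Handlers);
    Handlers.reset(new HandlerSet);
  }

  // Drop whatever the model file registered and restore the parked handlers.
  void finalizeForModelFile() {
    assert(Backup && "not parsing a model file");
    Handlers = std::move(Backup);
  }

private:
  std::unique_ptr<HandlerSet> Handlers;
  std::unique_ptr<HandlerSet> Backup;
};
```

Keeping the two steps as a matched pair means that pragmas seen in a
model file cannot leak into the main translation unit's handler set.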
Original file line number Diff line number Diff line change
@@ -23,6 +23,8 @@

namespace clang {

class CodeInjector;

namespace ento {
class CheckerManager;

@@ -50,7 +52,8 @@ class AnalysisManager : public BugReporterData {
StoreManagerCreator storemgr,
ConstraintManagerCreator constraintmgr,
CheckerManager *checkerMgr,
-AnalyzerOptions &Options);
+AnalyzerOptions &Options,
+CodeInjector* injector = nullptr);

~AnalysisManager();

5 changes: 3 additions & 2 deletions include/clang/StaticAnalyzer/Frontend/AnalysisConsumer.h
@@ -25,6 +25,8 @@ namespace clang {

class Preprocessor;
class DiagnosticsEngine;
class CodeInjector;
class CompilerInstance;

namespace ento {
class CheckerManager;
@@ -38,8 +40,7 @@ class AnalysisASTConsumer : public ASTConsumer {
/// analysis passes. (The set of analyses run is controlled by command-line
/// options.)
std::unique_ptr<AnalysisASTConsumer>
-CreateAnalysisConsumer(const Preprocessor &pp, const std::string &output,
-AnalyzerOptionsRef opts, ArrayRef<std::string> plugins);
+CreateAnalysisConsumer(CompilerInstance &CI);

} // end GR namespace

25 changes: 25 additions & 0 deletions include/clang/StaticAnalyzer/Frontend/FrontendActions.h
@@ -11,9 +11,13 @@
#define LLVM_CLANG_STATICANALYZER_FRONTEND_FRONTENDACTIONS_H

#include "clang/Frontend/FrontendAction.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringMap.h"

namespace clang {

class Stmt;

namespace ento {

//===----------------------------------------------------------------------===//
@@ -26,6 +30,27 @@ class AnalysisAction : public ASTFrontendAction {
StringRef InFile) override;
};

/// \brief Frontend action to parse model files.
///
/// This frontend action is responsible for parsing model files. Model files
/// cannot be parsed on their own; they rely on type information that is
/// available in another translation unit. The parsing of model files is done
/// by a separate compiler instance that reuses the ASTContext and other
/// information from the main translation unit that is being compiled. After a
/// model file is parsed, the function definitions will be collected into a
/// StringMap.
class ParseModelFileAction : public ASTFrontendAction {
public:
ParseModelFileAction(llvm::StringMap<Stmt *> &Bodies);
bool isModelParsingAction() const override { return true; }

protected:
std::unique_ptr<ASTConsumer> CreateASTConsumer(CompilerInstance &CI,
StringRef InFile) override;

private:
llvm::StringMap<Stmt *> &Bodies;
};

void printCheckerHelp(raw_ostream &OS, ArrayRef<std::string> plugins);

} // end GR namespace
44 changes: 44 additions & 0 deletions include/clang/StaticAnalyzer/Frontend/ModelConsumer.h
@@ -0,0 +1,44 @@
//===-- ModelConsumer.h -----------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
///
/// \file
/// \brief This file implements clang::ento::ModelConsumer which is an
/// ASTConsumer for model files.
///
//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_GR_MODELCONSUMER_H
#define LLVM_CLANG_GR_MODELCONSUMER_H

#include "clang/AST/ASTConsumer.h"
#include "llvm/ADT/StringMap.h"

namespace clang {

class Stmt;

namespace ento {

/// \brief ASTConsumer to consume model files' AST.
///
/// This consumer collects the bodies of function definitions into a StringMap
/// from a model file.
class ModelConsumer : public ASTConsumer {
public:
ModelConsumer(llvm::StringMap<Stmt *> &Bodies);

bool HandleTopLevelDecl(DeclGroupRef D) override;

private:
llvm::StringMap<Stmt *> &Bodies;
};
}
}

#endif
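
ModelConsumer.cpp is not part of this excerpt.  As a hedged sketch of
what "collects the bodies of function definitions into a StringMap"
could amount to, the example below uses invented stand-ins ('Stmt',
'FunctionDecl', 'BodyCollector') and 'std::map' in place of
'llvm::StringMap' so that it compiles on its own.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-ins for clang::Stmt and clang::FunctionDecl.
struct Stmt {
  std::string Text;
};

struct FunctionDecl {
  std::string Name;
  Stmt *Body; // nullptr for a declaration without a definition
};

// Hypothetical consumer: records the body of every function definition in a
// group of top-level declarations, keyed by function name.
class BodyCollector {
public:
  explicit BodyCollector(std::map<std::string, Stmt *> &Out) : Bodies(Out) {}

  // Returns true to let parsing continue, following the convention of
  // ASTConsumer::HandleTopLevelDecl.
  bool handleTopLevelDecl(const std::vector<FunctionDecl> &Group) {
    for (const FunctionDecl &FD : Group)
      if (FD.Body) // skip pure declarations
        Bodies[FD.Name] = FD.Body;
    return true;
  }

private:
  std::map<std::string, Stmt *> &Bodies;
};
```

Because the map is held by reference, whoever owns it (the action that
created the consumer, in the header above) sees the collected bodies
as soon as parsing finishes.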
20 changes: 14 additions & 6 deletions lib/Analysis/AnalysisDeclContext.cpp
@@ -69,8 +69,9 @@ AnalysisDeclContextManager::AnalysisDeclContextManager(bool useUnoptimizedCFG,
bool addTemporaryDtors,
bool synthesizeBodies,
bool addStaticInitBranch,
-bool addCXXNewAllocator)
-: SynthesizeBodies(synthesizeBodies)
+bool addCXXNewAllocator,
+CodeInjector *injector)
+: Injector(injector), SynthesizeBodies(synthesizeBodies)
{
cfgBuildOptions.PruneTriviallyFalseEdges = !useUnoptimizedCFG;
cfgBuildOptions.AddImplicitDtors = addImplicitDtors;
@@ -84,8 +85,8 @@ void AnalysisDeclContextManager::clear() {
llvm::DeleteContainerSeconds(Contexts);
}

-static BodyFarm &getBodyFarm(ASTContext &C) {
-static BodyFarm *BF = new BodyFarm(C);
+static BodyFarm &getBodyFarm(ASTContext &C, CodeInjector *injector = nullptr) {
+static BodyFarm *BF = new BodyFarm(C, injector);
return *BF;
}

@@ -94,7 +95,7 @@ Stmt *AnalysisDeclContext::getBody(bool &IsAutosynthesized) const {
if (const FunctionDecl *FD = dyn_cast<FunctionDecl>(D)) {
Stmt *Body = FD->getBody();
if (!Body && Manager && Manager->synthesizeBodies()) {
-Body = getBodyFarm(getASTContext()).getBody(FD);
+Body = getBodyFarm(getASTContext(), Manager->Injector.get()).getBody(FD);
if (Body)
IsAutosynthesized = true;
}
@@ -103,7 +104,7 @@ Stmt *AnalysisDeclContext::getBody(bool &IsAutosynthesized) const {
else if (const ObjCMethodDecl *MD = dyn_cast<ObjCMethodDecl>(D)) {
Stmt *Body = MD->getBody();
if (!Body && Manager && Manager->synthesizeBodies()) {
-Body = getBodyFarm(getASTContext()).getBody(MD);
+Body = getBodyFarm(getASTContext(), Manager->Injector.get()).getBody(MD);
if (Body)
IsAutosynthesized = true;
}
@@ -128,6 +129,13 @@ bool AnalysisDeclContext::isBodyAutosynthesized() const {
return Tmp;
}

bool AnalysisDeclContext::isBodyAutosynthesizedFromModelFile() const {
bool Tmp;
Stmt *Body = getBody(Tmp);
return Tmp && Body->getLocStart().isValid();
}


const ImplicitParamDecl *AnalysisDeclContext::getSelfDecl() const {
if (const ObjCMethodDecl *MD = dyn_cast<ObjCMethodDecl>(D))
return MD->getSelfDecl();
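
The 'isBodyAutosynthesizedFromModelFile' predicate in this hunk tells
the two kinds of synthesized body apart by checking whether the body's
start location is valid.  The sketch below restates that rule with
invented stand-in types; it assumes that bodies built in memory carry
no valid source location, which the check implies but this excerpt
does not show.

```cpp
// Stand-in for a source location: 0 plays the role of an invalid location.
struct SourceLoc {
  unsigned Raw;
  bool isValid() const { return Raw != 0; }
};

// Stand-in for a statement that remembers where it starts.
struct Stmt {
  SourceLoc Start;
};

enum class BodyOrigin { Written, BuiltInMemory, ModelFile };

// Classify a body using the same rule as the predicate: a synthesized body
// with a valid start location must have been parsed from a real file.
BodyOrigin classify(const Stmt &Body, bool IsAutosynthesized) {
  if (!IsAutosynthesized)
    return BodyOrigin::Written;
  return Body.Start.isValid() ? BodyOrigin::ModelFile
                              : BodyOrigin::BuiltInMemory;
}
```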
2 changes: 2 additions & 0 deletions lib/Analysis/BodyFarm.cpp
@@ -13,6 +13,7 @@
//===----------------------------------------------------------------------===//

#include "BodyFarm.h"
#include "clang/Analysis/CodeInjector.h"
#include "clang/AST/ASTContext.h"
#include "clang/AST/Decl.h"
#include "clang/AST/Expr.h"
@@ -381,6 +382,7 @@ Stmt *BodyFarm::getBody(const FunctionDecl *D) {
}

if (FF) { Val = FF(C, D); }
else if (Injector) { Val = Injector->getBody(D); }
return Val.getValue();
}
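
The lookup order in this hunk (a hardcoded factory if one exists for
the function, otherwise the injector if one was supplied) can be
sketched on its own.  'MiniBodyFarm' and its members are hypothetical
names; the cache stands in for the 'Bodies' map declared in
BodyFarm.h, whose exact behavior is not visible in this excerpt.

```cpp
#include <functional>
#include <map>
#include <string>

// Stand-in for clang::Stmt.
struct Stmt {
  std::string Text;
};

// Hypothetical farm that resolves a body in three steps: cached answer,
// hardcoded factory, then the optional external injector.
class MiniBodyFarm {
public:
  using Factory = std::function<Stmt *()>;
  using Injector = std::function<Stmt *(const std::string &)>;

  std::map<std::string, Factory> Factories; // hardcoded models, by name
  Injector External;                        // may be left empty

  Stmt *getBody(const std::string &Name) {
    auto Cached = Cache.find(Name);
    if (Cached != Cache.end())
      return Cached->second; // a cached nullptr means "no body available"

    Stmt *Body = nullptr;
    auto F = Factories.find(Name);
    if (F != Factories.end())
      Body = F->second();    // a hardcoded model takes precedence
    else if (External)
      Body = External(Name); // fall back to the external source

    Cache[Name] = Body;
    return Body;
  }

private:
  std::map<std::string, Stmt *> Cache;
};
```

Note that the injector is consulted only when no hardcoded factory
matches, so a model file cannot override a built-in model under this
ordering.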

4 changes: 3 additions & 1 deletion lib/Analysis/BodyFarm.h
@@ -27,10 +27,11 @@ class FunctionDecl;
class ObjCMethodDecl;
class ObjCPropertyDecl;
class Stmt;
+class CodeInjector;

class BodyFarm {
public:
-BodyFarm(ASTContext &C) : C(C) {}
+BodyFarm(ASTContext &C, CodeInjector *injector) : C(C), Injector(injector) {}

/// Factory method for creating bodies for ordinary functions.
Stmt *getBody(const FunctionDecl *D);
@@ -43,6 +44,7 @@ class BodyFarm {

ASTContext &C;
BodyMap Bodies;
CodeInjector *Injector;
};
}

1 change: 1 addition & 0 deletions lib/Analysis/CMakeLists.txt
@@ -11,6 +11,7 @@ add_clang_library(clangAnalysis
CallGraph.cpp
CocoaConventions.cpp
Consumed.cpp
CodeInjector.cpp
Dominators.cpp
DataflowWorklist.cpp
FormatString.cpp
15 changes: 15 additions & 0 deletions lib/Analysis/CodeInjector.cpp
@@ -0,0 +1,15 @@
//===-- CodeInjector.cpp ----------------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#include "clang/Analysis/CodeInjector.h"

using namespace clang;

CodeInjector::CodeInjector() {}
CodeInjector::~CodeInjector() {}