forked from apple/swift-clang
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@81678 91177308-0d34-0410-b5e6-96231b3b80d8
- Loading branch information
Showing
1 changed file
with
17 additions
and
169 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,178 +1,26 @@ | ||
//===----------------------------------------------------------------------===// | ||
// C Language Family Front-end | ||
//===----------------------------------------------------------------------===// | ||
Chris Lattner | ||
|
||
I. Introduction: | ||
|
||
clang: noun | ||
1. A loud, resonant, metallic sound. | ||
2. The strident call of a crane or goose. | ||
3. C-language family front-end toolkit. | ||
Welcome to Clang. This is a compiler front-end for the C family of languages | ||
(C, C++, Objective-C, and Objective-C++) which is built as part of the LLVM | ||
compiler intrastructure project. | ||
|
||
The world needs better compiler tools, tools which are built as libraries. This | ||
design point allows reuse of the tools in new and novel ways. However, building | ||
the tools as libraries isn't enough: they must have clean APIs, be as | ||
decoupled from each other as possible, and be easy to modify/extend. This | ||
requires clean layering, decent design, and avoiding tying the libraries to a | ||
specific use. Oh yeah, did I mention that we want the resultant libraries to | ||
be as fast as possible? :) | ||
Unlike many other compiler frontends, Clang is useful for a number of things | ||
beyond just compiling code: we intend for Clang to be host to a number of | ||
different source level tools. One example of this is the Clang Static Analyzer. | ||
|
||
This front-end is built as a component of the LLVM toolkit that can be used | ||
with the LLVM backend or independently of it. In this spirit, the API has been | ||
carefully designed as the following components: | ||
|
||
libsupport - Basic support library, reused from LLVM. | ||
If you're interested in more (including how to build Clang) it is best to read | ||
the relevant web sites. Here are some pointers: | ||
|
||
libsystem - System abstraction library, reused from LLVM. | ||
|
||
libbasic - Diagnostics, SourceLocations, SourceBuffer abstraction, | ||
file system caching for input source files. This depends on | ||
libsupport and libsystem. | ||
Information on Clang: http://clang.llvm.org/ | ||
Building and using Clang: http://clang.llvm.org/get_started.html | ||
Clang Static Analyzer: http://clang-analyzer.llvm.org/ | ||
Information on the LLVM project: http://llvm.org/ | ||
|
||
libast - Provides classes to represent the C AST, the C type system, | ||
builtin functions, and various helpers for analyzing and | ||
manipulating the AST (visitors, pretty printers, etc). This | ||
library depends on libbasic. | ||
|
||
|
||
liblex - C/C++/ObjC lexing and preprocessing, identifier hash table, | ||
pragma handling, tokens, and macros. This depends on libbasic. | ||
|
||
libparse - C (for now) parsing and local semantic analysis. This library | ||
invokes coarse-grained 'Actions' provided by the client to do | ||
stuff (e.g. libsema builds ASTs). This depends on liblex. | ||
|
||
libsema - Provides a set of parser actions to build a standardized AST | ||
for programs. AST's are 'streamed' out a top-level declaration | ||
at a time, allowing clients to use decl-at-a-time processing, | ||
build up entire translation units, or even build 'whole | ||
program' ASTs depending on how they use the APIs. This depends | ||
on libast and libparse. | ||
|
||
librewrite - Fast, scalable rewriting of source code. This operates on | ||
the raw syntactic text of source code, allowing a client | ||
to insert and delete text in very large source files using | ||
the same source location information embedded in ASTs. This | ||
is intended to be a low-level API that is useful for | ||
higher-level clients and libraries such as code refactoring. | ||
|
||
libanalysis - Source-level dataflow analysis useful for performing analyses | ||
such as computing live variables. It also includes a | ||
path-sensitive "graph-reachability" engine for writing | ||
analyses that reason about different possible paths of | ||
execution through source code. This is currently being | ||
employed to write a set of checks for finding bugs in software. | ||
|
||
libcodegen - Lower the AST to LLVM IR for optimization & codegen. Depends | ||
on libast. | ||
|
||
clang - An example driver, client of the libraries at various levels. | ||
This depends on all these libraries, and on LLVM VMCore. | ||
|
||
This front-end has been intentionally built as a DAG of libraries, making it | ||
easy to reuse individual parts or replace pieces if desired. For example, to | ||
build a preprocessor, you take the Basic and Lexer libraries. If you want an | ||
indexer, you take those plus the Parser library and provide some actions for | ||
indexing. If you want a refactoring, static analysis, or source-to-source | ||
compiler tool, it makes sense to take those plus the AST building and semantic | ||
analyzer library. Finally, if you want to use this with the LLVM backend, | ||
you'd take these components plus the AST to LLVM lowering code. | ||
|
||
In the future I hope this toolkit will grow to include new and interesting | ||
components, including a C++ front-end, ObjC support, and a whole lot of other | ||
things. | ||
|
||
Finally, it should be pointed out that the goal here is to build something that | ||
is high-quality and industrial-strength: all the obnoxious features of the C | ||
family must be correctly supported (trigraphs, preprocessor arcana, K&R-style | ||
prototypes, GCC/MS extensions, etc). It cannot be used if it is not 'real'. | ||
|
||
|
||
II. Usage of clang driver: | ||
|
||
* Basic Command-Line Options: | ||
- Help: clang --help | ||
- Standard GCC options accepted: -E, -I*, -i*, -pedantic, -std=c90, etc. | ||
- To make diagnostics more gcc-like: -fno-caret-diagnostics -fno-show-column | ||
- Enable metric printing: -stats | ||
|
||
* -fsyntax-only is currently the default mode. | ||
|
||
* -E mode works the same way as GCC. | ||
|
||
* -Eonly mode does all preprocessing, but does not print the output, | ||
useful for timing the preprocessor. | ||
|
||
* -fsyntax-only is currently partially implemented, lacking some | ||
semantic analysis (some errors and warnings are not produced). | ||
|
||
* -parse-noop parses code without building an AST. This is useful | ||
for timing the cost of the parser without including AST building | ||
time. | ||
|
||
* -parse-ast builds ASTs, but doesn't print them. This is most | ||
useful for timing AST building vs -parse-noop. | ||
|
||
* -parse-ast-print pretty prints most expression and statements nodes. | ||
|
||
* -parse-ast-check checks that diagnostic messages that are expected | ||
are reported and that those which are reported are expected. | ||
|
||
* -dump-cfg builds ASTs and then CFGs. CFGs are then pretty-printed. | ||
|
||
* -view-cfg builds ASTs and then CFGs. CFGs are then visualized by | ||
invoking Graphviz. | ||
|
||
For more information on getting Graphviz to work with clang/LLVM, | ||
see: http://llvm.org/docs/ProgrammersManual.html#ViewGraph | ||
|
||
|
||
III. Current advantages over GCC: | ||
|
||
* Column numbers are fully tracked (no 256 col limit, no GCC-style pruning). | ||
* All diagnostics have column numbers, includes 'caret diagnostics', and they | ||
highlight regions of interesting code (e.g. the LHS and RHS of a binop). | ||
* Full diagnostic customization by client (can format diagnostics however they | ||
like, e.g. in an IDE or refactoring tool) through DiagnosticClient interface. | ||
* Built as a framework, can be reused by multiple tools. | ||
* All languages supported linked into same library (no cc1,cc1obj, ...). | ||
* mmap's code in read-only, does not dirty the pages like GCC (mem footprint). | ||
* LLVM License, can be linked into non-GPL projects. | ||
* Full diagnostic control, per diagnostic. Diagnostics are identified by ID. | ||
* Significantly faster than GCC at semantic analysis, parsing, preprocessing | ||
and lexing. | ||
* Defers exposing platform-specific stuff to as late as possible, tracks use of | ||
platform-specific features (e.g. #ifdef PPC) to allow 'portable bytecodes'. | ||
* The lexer doesn't rely on the "lexer hack": it has no notion of scope and | ||
does not categorize identifiers as types or variables -- this is up to the | ||
parser to decide. | ||
|
||
Potential Future Features: | ||
|
||
* Fine grained diag control within the source (#pragma enable/disable warning). | ||
* Better token tracking within macros? (Token came from this line, which is | ||
a macro argument instantiated here, recursively instantiated here). | ||
* Fast #import with a module system. | ||
* Dependency tracking: change to header file doesn't recompile every function | ||
that texually depends on it: recompile only those functions that need it. | ||
This is aka 'incremental parsing'. | ||
|
||
|
||
IV. Missing Functionality / Improvements | ||
|
||
Lexer: | ||
* Source character mapping. GCC supports ASCII and UTF-8. | ||
See GCC options: -ftarget-charset and -ftarget-wide-charset. | ||
* Universal character support. Experimental in GCC, enabled with | ||
-fextended-identifiers. | ||
* -fpreprocessed mode. | ||
|
||
Preprocessor: | ||
* #assert/#unassert | ||
* MSExtension: "L#param" stringizes to a wide string literal. | ||
* Add support for -M* | ||
|
||
Traditional Preprocessor: | ||
* Currently, we have none. :) | ||
If you have questions or comments about Clang, a great place to discuss them is | ||
on the Clang development mailing list: | ||
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev | ||
|
||
If you find a bug in Clang, please file it in the LLVM bug tracker: | ||
http://llvm.org/bugs/ |