A Python interface into Valgrind's VEX IR! This was created mainly to use VEX for static analysis, but it would be cool to integrate it with Valgrind as well. To that end, I've started writing pygrind to pass instrumentation over to Python, but this doesn't work yet.
For now, pyvex requires Valgrind to be compiled with -fPIC:
mkdir ~/valgrind
cd ~/valgrind
wget http://valgrind.org/downloads/valgrind-3.8.1.tar.bz2
tar xvfj valgrind-3.8.1.tar.bz2
cd valgrind-3.8.1
CFLAGS=-fPIC ./configure --prefix=$HOME/valgrind/inst
make
make install
Great! Now you can build pyvex.
python setup.py build
Sweet! Now you'll notice that two libraries are built. pyvex.so is pyvex with all the functionality, and pyvex_dynamic is pyvex without the ability to statically create IRSBs from provided bytes. With pyvex_dynamic, the only pre-made IRSBs would presumably come from Valgrind at runtime, but since that doesn't work yet, it is rather useless for now.
You can use pyvex pretty easily. For now, it only supports translation and pretty printing:
import pyvex
irsb = pyvex.IRSB(bytes="\x55\xc3") # translates "push ebp; ret" to VEX IR
irsb.pp() # prints the VEX IR
Awesome stuff!
- Get pyvex working in Valgrind, dynamically.
- This requires getting the Python interpreter to play nice with Valgrind. It's unclear whether this is possible.
- Debug this stuff.
- Some class members are named incorrectly. I started out trying to name things nicer, but then realized that the naming should be consistent with the C structs. The inconsistencies should be fixed.
- help() is sorely lacking
- The class objects for the different sub-statements, sub-expressions, and sub-constants get inherited by instances of these classes. This is kind of ugly (i.e., pyvex.IRStmt.NoOp().WrTmp is a valid reference to the WrTmp class).
- pretty-printing an emptyIRSB segfaults
- when used statically, memory is never freed
- converting from string to tag is currently very slow (a hastily written consecutive bunch of strcmps)
- IRCallee assumes that addresses are 64 bits long, and will corrupt memory otherwise. This can be fixed by writing a getter/setter instead of using the macroed ones.
- CCalls are created by creating the IRCallee and manually building the args list, instead of by calling the helper functions. Not sure if this is good or bad. On the other hand, Dirty statements are created through helper functions.
- deepCopying a binder IRExpr seems to crash VEX
- deepCopying a V256 const is not implemented by VEX's deepCopy stuff
- IRDirty's fxState array access is untested
- equality (for those things that easily have it) should be implemented as a rich comparator
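To make the last item concrete, here is a rough pure-Python sketch of what "rich comparison" equality could look like. The Const class below is a hypothetical stand-in (a tag plus a value) and is not pyvex's actual class. Since pyvex is a C extension, the real fix would presumably go through the type's tp_richcompare slot rather than __eq__/__ne__, so treat this only as an illustration of the intended behavior.

```python
class Const(object):
    """Hypothetical stand-in for a VEX constant: a type tag plus a value."""

    def __init__(self, tag, value):
        self.tag = tag
        self.value = value

    def __eq__(self, other):
        # Decline to compare against unrelated types so that Python can
        # fall back to its default behavior.
        if not isinstance(other, Const):
            return NotImplemented
        return (self.tag, self.value) == (other.tag, other.value)

    def __ne__(self, other):
        # Spelled out explicitly so the class also behaves on Python 2,
        # where __ne__ is not derived from __eq__.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash((self.tag, self.value))


a = Const("Ico_U32", 0x1000)
b = Const("Ico_U32", 0x1000)
c = Const("Ico_U64", 0x1000)
```

With this, a == b holds even though they are distinct objects, while a != c because the tags differ. Comparing against an unrelated type (such as a plain string) simply yields "not equal" instead of raising.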