Learning Robotic Manipulation of Granular Media (CoRL 2017)



  • This paper examines robotic manipulation of granular media: we learn predictive models of granular media dynamics and use them to perform scooping and dumping actions.
  • Posted to arxiv. (link)
  • Submission video (link)

Human Pose (CVPR 2017)



  • State-of-the-art RGB human pose on MSCOCO.
  • Posted to arxiv. (link)
  • Slides from an invited talk at the RSS 2017 Articulated Tracking Workshop (link)

Fluid Simulation (ICML 2017)



  • A learning-based system for simulating fluids in real-time.
  • Posted to arxiv as pre-print. (link)
  • Project page is now up (code and data coming soon)!

Open Source Tools

  • I've written or contributed to many OSS tools over the years. This is a short list of some:
    • jtorch - Torch7 Utility Library for running models in OpenCL / C++
    • jcl - OpenCL Wrapper (to make OpenCL easier)
    • torchzlib - A utility library for zlib compression / decompression of Torch7 tensors
    • matlabnoise - Matlab procedural noise library
    • matlabobj - Matlab obj reader
    • torch7 - I'm a reasonably regular contributor to torch and its various packages
    • icp - A C++ Iterative Closest Point library (with Matlab interface)
    • ik - A very simple inverse kinematics library (in C++)
    • ModelFit - Off-line fitting portion of the hand-tracking paper below
    • jzmq - A ZeroMQ Utility Library (C++)
    • There are probably others... See my github page (link) for more details

PhD Thesis (2015)



Motion Capture (CVPR 2015)



  • Joint work with the amazing folks over at MPII. (They definitely did the lion's share of the work on this project!)
  • State-of-the-art system for obtaining high quality motion capture in arbitrary scenes from a low number of cameras.
  • Accepted to CVPR 2015. (pdf) (project page)

Body Tracking (CVPR 2015)



  • Improved our own state-of-the-art (NIPS 2014, ACCV 2014) by introducing:
    • A novel cascaded architecture to help overcome the effects of MaxPooling.
    • A modified dropout that works better in the presence of spatially-coherent activations.
  • Accepted to CVPR 2015 and posted on arxiv (pdf).
  • Here are the predictions for our model. Note: the MPII dataset analysis code has been verified against the code at MPII (so it's correct and fair). I'm supplying the validation set (which has ground-truth joint positions) so that you can make sure your own code matches ours. In particular, the MPII evaluation applies a head-box scale of 0.6 before calculating PCKh (making performance worse), which a lot of people miss because it's not mentioned in their paper.
  • We used Mechanical Turk to identify the MPII images that contain only a single person. The image list can be found here.
  • The Matlab array of our test and training set can be found here.
  • Here are the Matlab figures (from the paper) for FLIC and MPII: figures_flic_mpii.zip.
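
To make the PCKh note above concrete, here is a minimal Python sketch of the metric. It is not the released analysis code. It assumes the head size is the diagonal of the head bounding box multiplied by the 0.6 scale, and that a joint counts as correct when its error is within 0.5 of that head size; the function name `pckh` and its arguments are hypothetical.

```python
import math

def pckh(pred, gt, head_boxes, scale=0.6, threshold=0.5):
    """Fraction of joints whose error is within threshold * head size.

    pred, gt:    one list of (x, y) joints per person.
    head_boxes:  one (x1, y1, x2, y2) head rectangle per person.
    scale:       head-box scale (0.6 per the note above).
    Assumption:  head size = scale * diagonal of the head box.
    """
    correct = 0
    total = 0
    for p_joints, g_joints, (x1, y1, x2, y2) in zip(pred, gt, head_boxes):
        head_size = scale * math.hypot(x2 - x1, y2 - y1)
        for (px, py), (gx, gy) in zip(p_joints, g_joints):
            err = math.hypot(px - gx, py - gy)
            correct += err <= threshold * head_size
            total += 1
    return correct / total if total else 0.0

# One person, two joints with errors of 10 and 20 pixels.
gt = [[(0.0, 0.0), (100.0, 100.0)]]
pred = [[(10.0, 0.0), (100.0, 120.0)]]
boxes = [(0.0, 0.0, 30.0, 40.0)]  # head-box diagonal = 50 pixels

with_scale = pckh(pred, gt, boxes)                # cutoff = 0.5 * 0.6 * 50 = 15
without_scale = pckh(pred, gt, boxes, scale=1.0)  # cutoff = 0.5 * 50 = 25
```

In this toy case the score drops from 1.0 to 0.5 once the 0.6 scale is applied, which is why omitting it makes results look better than they are.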

Local Image Descriptor (Submitted Work)



  • Designed a SIFT-like descriptor, which outperforms existing state-of-the-art descriptors.
  • More details will be available soon!
  • Submitted to a major conference (decision pending).

Body Tracking (NIPS 2014)



  • Following our ICLR 2014 paper, we substantially improved the architecture, incorporated the MRF into the ConvNet, and significantly outperformed the existing state of the art.
  • Accepted to NIPS 2014 and posted on arxiv (pdf)
  • For comparison with our model we've released our LSP and FLIC predictions
  • We've also released the FLIC-plus dataset

Body Tracking (ACCV 2014)



  • We investigated the use of motion features when training ConvNet architectures.
  • For ambiguous poses with poor image evidence (such as detecting the pose of camouflaged actors), we showed that motion flow features allow us to outperform state-of-the-art techniques.
  • Accepted to ACCV 2014 and posted on arxiv (pdf)

Slow-feature Auto-encoder (NIPS 2014 Workshop)



  • We presented a sparse auto-encoder architecture to make use of temporal coherence. This formulation enables pre-training on unlabeled video data (of which there is a massive abundance), to improve ConvNet performance.
  • Submitted to a major conference (decision pending); also accepted to a NIPS 2014 workshop (pdf).

Body Tracking (ICLR 2014)



  • A new architecture for human pose estimation using a ConvNet + MRF spatial model.
  • First paper to show that a variation of deep learning could outperform existing architectures.
  • Accepted to ICLR 2014 (pdf)

Hand Tracking (SIGGRAPH 2014)



  • A novel method for real-time pose recovery of markerless complex articulable objects from a single depth image. We showed state-of-the-art results for real-time hand tracking.
  • Accepted to TOG and presented at SIGGRAPH'14 (pdf) (ppt)
  • Dataset is now public!
  • Offline fitting code is now public!

Distributed Locking Protocol (Summer Internship)



  • Worked at MongoDB Inc with the server kernel team (under Alberto Lerner).
  • I developed a new distributed lease protocol (for the sharding config server) using a heavily modified two-phase commit with a timeout mechanism.

Randomized Decision Forests (Early Hand-Tracking Research)



ARCADE (SIGGRAPH Realtime Live 2012)



  • ARCADE was a system that allowed real-time video-based presentations that convey the illusion that presenters are directly manipulating holographic 3D objects with their hands.
  • Group project with the MIT Media Lab and NYU Media Research Lab: Jonathan Tompson, Ken Perlin, Murphy Stein, Charlie Hendee, Xiao Xiao, Hiroshi Ishii.
  • The content in the above video was presented at “SIGGRAPH 2012 - Real-Time Live!”.

Mesh Decimation



PRenderer - A Pretty Renderer



  • DirectX 9c & OpenGL 4.2 deferred rendering engine with the following features:
    • Compact G-Buffer encoding
    • Light volume optimizations
    • Soft shadows using PSVSM
    • Screen space ambient occlusion
    • Motion blur
    • HDR Rendering pipeline
    • Depth of Field

XNA 3.1 Game - Hungry Bee!



  • Take control of "Bumble the bee" and rescue all his bee-friends!
  • Custom 3D, impulse-based physics engine with RK4 integrator.
  • Vertex and fragment shaders for cel-shaded cartoon effects and post-processing.
  • Game settings and level design implemented with document-driven programming for easy editing without re-compilation.
  • Art and music assets come from various open-source game-dev resource sites.
SVN Repository & Source: code.google.com/p/hungrybee/
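
For readers unfamiliar with the RK4 integrator mentioned above, the following is a generic Python sketch of one classical fourth-order Runge-Kutta step. It is not the game's XNA code; the function name `rk4_step` and the harmonic-oscillator example are assumptions chosen for illustration.

```python
import math

def rk4_step(f, t, y, h):
    """Advance the state y (a list of floats) from t to t + h.

    f(t, y) must return the time derivative of y as a list of floats.
    """
    def shifted(state, slope, factor):
        # state + factor * slope, element by element
        return [s + factor * k for s, k in zip(state, slope)]

    k1 = f(t, y)
    k2 = f(t + h / 2, shifted(y, k1, h / 2))
    k3 = f(t + h / 2, shifted(y, k2, h / 2))
    k4 = f(t + h, shifted(y, k3, h))
    return [yi + (h / 6) * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def oscillator(t, state):
    # Unit-mass, unit-stiffness spring: x' = v, v' = -x
    x, v = state
    return [v, -x]

# Integrate one full period (2*pi); the state should return close to (1, 0).
steps = 1000
h = 2 * math.pi / steps
state = [1.0, 0.0]
for i in range(steps):
    state = rk4_step(oscillator, i * h, state, h)
```

A physics engine would apply the same step to each body's position, velocity, orientation, and angular momentum, with `f` evaluating forces and torques.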

OBB Tree Collision Detection



  • A physics engine to showcase a practical implementation of the paper: Gottschalk et al., "OBBTree: A Hierarchical Structure for Rapid Interference Detection" (SIGGRAPH '96).
  • RK4 integrator with full 3D RBO simulation implemented.
  • Performs offline covariance-based OBB fitting of general polygon soups, including convex hull generation for statistically optimal OBB axes.
  • Runtime collision detection of OBB nodes using Gottschalk's algorithm based on the separating axis theorem. O(log(n)) algorithm and optimized code suitable for realtime applications.
  • 3D models and background from 3rd party sources.
SVN Repository & Source: code.google.com/p/obbdetection/
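
The separating-axis test between two OBB nodes mentioned above can be sketched in Python as follows. This is the plain, unoptimized 15-axis form of the test, not the engine's code and not the algebraically simplified version from the paper. The function names and the box representation (center, three orthonormal axes, three half-extents) are assumptions made for illustration.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def projected_radius(axis, box_axes, half_extents):
    # Half-length of the box's shadow when projected onto `axis`.
    return sum(h * abs(dot(axis, u)) for u, h in zip(box_axes, half_extents))

def obbs_overlap(center_a, axes_a, half_a, center_b, axes_b, half_b, eps=1e-12):
    """Return True if the two oriented boxes intersect.

    Candidate separating axes: 3 face normals of A, 3 of B, and the
    9 cross products of their edge directions.
    """
    t = tuple(cb - ca for ca, cb in zip(center_a, center_b))
    candidates = list(axes_a) + list(axes_b)
    candidates += [cross(ua, ub) for ua in axes_a for ub in axes_b]
    for axis in candidates:
        if dot(axis, axis) < eps:
            continue  # parallel edges give a degenerate (zero) axis
        ra = projected_radius(axis, axes_a, half_a)
        rb = projected_radius(axis, axes_b, half_b)
        if abs(dot(t, axis)) > ra + rb:
            return False  # found a separating axis
    return True

IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
c = s = math.sqrt(0.5)
ROT_Z_45 = ((c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0))  # 45 degrees about z
HALF = (1.0, 1.0, 1.0)

touching = obbs_overlap((0, 0, 0), IDENTITY, HALF, (2.3, 0, 0), ROT_Z_45, HALF)
apart = obbs_overlap((0, 0, 0), IDENTITY, HALF, (2.5, 0, 0), ROT_Z_45, HALF)
```

Here the rotated box reaches about 1.414 units along x, so the pair overlaps at a center distance of 2.3 and separates at 2.5. In an OBB tree this test runs on node pairs during traversal, and child nodes are visited only when their parents overlap.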

Cloth Simulation Engine



  • Deformable object engine based on the SIGGRAPH paper: “Large Steps in Cloth Simulation”.
  • Optimized Baraff and Witkin’s algorithm by simplifying shear and bend force calculations, while maintaining the ODE complexity for damped harmonic oscillations between cloth vertices.
  • Engine achieves good stability for unconstrained cloth simulations.
SVN Repository & Source: code.google.com/p/clothsim/

3D Sound Rendering

 
  • Created a real-time sound renderer to synthesize 3D acoustic cues from 2D sources.
  • This work was loosely inspired by the paper: C.P. Brown et al., "A Structural Model for Binaural Sound Synthesis", IEEE Transactions on Speech and Audio Processing (1998).

Software Administrator at Epoch Microelectronics

  • Developed programs to aid circuit design and verification (Lisp/SKILL):
    • Custom place-and-route tools
    • Simulation engine add-ons to verify circuit design performance
    • Database management
  • Maintained servers used in batch-computing environments for simulation suites (Spectre, Columbus, etc.).