- Shanghai Jiao Tong University
- China
- https://pyywill.github.io/
Stars
Planning from Imagination: Episodic Simulation and Episodic Memory for Vision-and-Language Navigation
A collection of MARL benchmarks based on TorchRL
A generative world for general-purpose robotics & embodied AI learning.
[AAAI-25 Oral] Official Implementation of "FLAME: Learning to Navigate with Multimodal LLM in Urban Environments"
[ICCV 2023] Official repo of "BEVBert: Multimodal Map Pre-training for Language-guided Navigation"
Collection of advice for prospective and current PhD students
Code for reproducing the results of NeurIPS 2020 paper "MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation"
We propose exploring and searching for targets in unknown environments using a Large Language Model for multi-robot systems.
LaTeX code for making neural network diagrams
Provides code (in MATLAB and Python) for visualizing numerical experiment results.
Training code for the waypoint predictor in Discrete-to-Continuous VLN.
Official implementation of the ECCV 2022 Oral paper: Sim-2-Sim Transfer for Vision-and-Language Navigation in Continuous Environments
Intelligent Vehicle Competition (SJTU Intramural)
Matterport3D is a pretty awesome dataset for RGB-D machine learning tasks :)
Official implementation of "Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation" (CVPR'22 Oral).
List of Research Internships for Undergraduate Students
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
A curated list for vision-and-language navigation. ACL 2022 paper "Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions"
AI Research Platform for Reinforcement Learning from Real Panoramic Images.
Implementation of "Multimodal Text Style Transfer for Outdoor Vision-and-Language Navigation"
Code for ORAR Agent for Vision and Language Navigation on Touchdown and map2seq