+
+
+
+
+
+
diff --git a/CNAME b/CNAME
new file mode 100644
index 0000000..961ee09
--- /dev/null
+++ b/CNAME
@@ -0,0 +1 @@
+liralab.usc.edu
\ No newline at end of file
diff --git a/LICENSE.txt b/LICENSE.txt
new file mode 100644
index 0000000..90e3b35
--- /dev/null
+++ b/LICENSE.txt
@@ -0,0 +1,25 @@
+Copyright (c) 2014-2018 John Otander
+Copyright (c) 2014 Daniel Eden for animate.css
+Copyright (c) 2014 Brent Jackson for Basscss
+Copyright (c) 2013 Twitter, Inc for CSS copied from Bootstrap
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..efb0269
--- /dev/null
+++ b/README.md
@@ -0,0 +1,21 @@
+# USC Lira Lab
+SaFoLab's website, hosted on GitHub Pages!
+
+## Developing the Website
+To develop, **you must do your work on the `source` branch**. `main` is autogenerated via a `rake` job (see `Rakefile` for details). The `source` branch contains all the Jekyll code. To switch to it, run
+```
+git checkout source
+```
+
+## Publishing the Website
+When you've made the desired changes and are in the `source` branch, simply execute:
+```
+rake publish
+```
+If you want to commit with a custom commit message, run:
+```
+rake publish["custom commit message\, and this is how to use a comma"]
+```
+
+
+
diff --git a/Rakefile b/Rakefile
new file mode 100644
index 0000000..8552626
--- /dev/null
+++ b/Rakefile
@@ -0,0 +1,43 @@
+desc "Build _site/ for production"
+task :build do
+ puts "\n## Building Jekyll to _site/"
+ status = system("JEKYLL_ENV=production bundle exec jekyll build")
+ puts status ? "Success" : "Failed"
+end
+
+desc "Commit _site/"
+task :commit, [:commit_name] do |t, args|
+ puts "\n## Staging modified files"
+ status = system("git add -A")
+ puts status ? "Success" : "Failed"
+ puts "\n## Committing a site build at #{Time.now.utc}"
+ message = "Build site at #{Time.now.utc}. " + args[:commit_name].to_s
+ status = system("git commit -m \"#{message}\"")
+ puts status ? "Success" : "Failed"
+ puts "\n## Pushing commits to remote"
+ status = system("git push origin source")
+ puts status ? "Success" : "Failed"
+end
+
+desc "Deploy _site/ to main branch"
+task :deploy do
+ puts "\n## Deleting main branch"
+ status = system("git branch -D main")
+ puts status ? "Success" : "Failed"
+ puts "\n## Creating new main branch and switching to it"
+ status = system("git checkout -b main")
+ puts status ? "Success" : "Failed"
+ puts "\n## Forcing the _site subdirectory to be project root"
+ status = system("git filter-branch --subdirectory-filter _site/ -f")
+ puts status ? "Success" : "Failed"
+ puts "\n## Switching back to source branch"
+ status = system("git checkout source")
+ puts status ? "Success" : "Failed"
+ puts "\n## Pushing all branches to origin"
+ status = system("git push --all origin --force")
+ puts status ? "Success" : "Failed"
+end
+
+desc "Commit and deploy _site/"
+task :publish, [:commit_name] => [:build, :commit, :deploy] do
+end
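One caveat in the tasks above: each step prints "Failed" when a command errors but execution continues, so a broken build could still be committed and deployed. A fail-fast wrapper is one possible improvement; the sketch below is purely illustrative (the `run` helper is an assumption, not part of the current Rakefile):

```ruby
# Hypothetical fail-fast helper: abort the whole rake run on the first
# failing shell command instead of printing "Failed" and continuing.
def run(description, command)
  puts "\n## #{description}"
  ok = system(command)
  abort("Failed: #{command}") unless ok
  puts "Success"
  ok
end

# Example rewrite of the :build task using the helper:
# task :build do
#   run("Building Jekyll to _site/",
#       "JEKYLL_ENV=production bundle exec jekyll build")
# end
```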
diff --git a/about/index.html b/about/index.html
new file mode 100644
index 0000000..a06e81d
--- /dev/null
+++ b/about/index.html
@@ -0,0 +1,371 @@
+
+
+
+
+
+
+
+
+
+
+ About Pixyll – SaFoLab-WISC
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
Robot learning is an interdisciplinary field at the intersection of robotics, machine learning, cognitive science, and control theory, aiming to create intelligent and adaptable robotic systems capable of learning from their environment and experience. With rapid advances in artificial intelligence and computing power, and the growing availability of large datasets, robot learning has the potential to revolutionize a wide range of applications, from manufacturing and healthcare to transportation and personal assistance. However, developing learning algorithms for real-world robotic systems poses unique challenges due to the complexities of the physical world, safety concerns, and the need for efficient and robust learning methods.
+
This course provides a comprehensive introduction to the fundamentals of robot learning, covering topics such as reinforcement learning, computer vision, meta-learning, sim-to-real transfer, and multi-agent learning. Students will explore cutting-edge techniques in imitation learning, inverse reinforcement learning, representation learning, and safe and robust learning, while also discussing the real-world applications and challenges of robot learning. The course is designed to be accessible to PhD students in robotics, control theory, machine learning, artificial intelligence, optimization, and related fields, with an emphasis on both theoretical foundations and practical applications.
+
In addition to lectures, the course features a series of student-led presentations on recent research papers and a course project, allowing students to gain hands-on experience with the latest advances in robot learning and explore emerging research topics. Through a combination of lectures, homework assignments, presentations, and project work, students will develop a deep understanding of robot learning techniques and their potential to transform the way we interact with and utilize robots in our everyday lives.
+
+
Prerequisites:
+
Students are recommended to have familiarity with fundamental concepts in machine learning. CSCI 467: Introduction to Machine Learning, and CSCI 445L: Introduction to Robotics are recommended but not required.
Homework (45%): Students will be assigned three homework sets that consist of both report questions and
+programming questions (in Python). Report questions will require students to work on problems related to
+past lectures with pen and paper. Programming questions will require students to implement some of the
+methods covered in the lectures, occasionally with further improvements, and experiment with them on
+simulated robot environments and/or machine learning tasks.
+
Homework reports and code will be submitted online. Students will have a total of 8 free late days that may be used for the homework assignments; a maximum of 4 late days will be allowed on a given assignment. Late days are only for homework assignments and cannot be used for the class presentation or deadlines related to the course project.
+
+
Class Presentation (15%): Students will present research papers from literature. Presentations will be followed
+by open discussions. Students will be graded based on their presentations.
+
+
In case the class size is too large to have every student make a presentation, some students will be required
+ to write a short review of the papers that will be presented in class. These short reviews will be due at the beginning of class.
+
+
Course Project (40%): Students will be required to work on a course project in groups of 2-3. The
+projects must have both robotics and machine learning components. They can be, for example, application-dependent
+improvements over an existing robot learning method, a novel robot learning related
+application of an existing technique, or a completely new method that may have potential benefits.
+Students will write a 2-page project proposal, present their findings in an oral presentation, write a conference
+paper-style project report of 6-8 pages, and write an anonymous peer review (max 1 page) for the project report of another group.
+There will be a 2-page project milestone along the way to guide progress. Instructor and teaching assistant(s)
+will provide feedback on the project milestone.
+
+
+
+
+
Other References
+ This class is partially based on the following existing courses:
+
+In this experiment, you will be playing a simple computer game where you control a lunar lander spacecraft.
+
+The game interface is shown on the right. It is important that you do NOT refresh or close the page. Otherwise, the experiment will remain incomplete and you will not have a chance to continue.
+
+
+ Goal. The game consists of two phases. In the first phase, you just need to stabilize the lander, i.e., keep it upright. The goal of the second phase is to safely and quickly land the spacecraft on the ground, as close as possible to the landing pad between the two flags. A landing is safe if the main body of the spacecraft does not hit the ground. Examples of successful and unsuccessful landing attempts are shown and explained below.
+
+
+
+
+
+
+
+After each round a score will appear on the right. You will get a green check for a successful stabilization (in the first phase) or landing (in the second phase) and a red cross for an unsuccessful one. Roughly, scores of 200 and above indicate success.
+
+
+Even if the lander crashes or goes out of the field of view, you may still receive a somewhat higher score if the lander is close to the landing pad or the collision is mild.
+
+
+ Control Input. You will be using three arrow keys to control the lander. At any time, at most one control input is active, i.e., you are not allowed to fire multiple engines at the same time. If you press more than one key, only the last one will be active. The figure below shows these controls.
+
+
+
+
+
+
+
+ Auto Pilot. Finally, we have an auto pilot implemented. The auto pilot takes over control completely once it is active, but it activates only automatically, so you do not control when it turns on or off. You can see the state of the auto pilot (on or off) at the top right corner of the game screen (see the figures below).
+
+
+
+
+
+
+You will be playing the game for 20 rounds. In some rounds, the auto pilot may take over control often, and in others it may not interfere at all; it may even stay off during all 20 rounds. When the auto pilot does become active, it does so only to correct suboptimal control inputs you provide. Regardless of whether the auto pilot is on or off, you MUST keep providing control inputs to the lander.
+
+
+At the end of some rounds, the auto pilot may demonstrate why it interfered by showing you replays. This is a chance to improve your skill at controlling the lander. Specifically, the auto pilot will point out a bad sequence of 3 actions you took and show what would have happened had it not interfered. Next, it will show an ideal trajectory that follows the 3 actions it recommends. Example replays are shown in the figures below.
+
+
+
+
+
+
+The first 7 rounds are the first phase, where the goal is to keep the lander upright. You will have 16 seconds in each round of the first phase. In the second phase (13 rounds), you will have 48 seconds to land the spacecraft in each round. The round automatically ends when the attempt is complete, either successfully or unsuccessfully.
+
+
+You will be compensated \$3.75 for this study. We are compensating at the rate posted on Amazon Mechanical Turk, i.e., \$0.25 per minute (\$15/hour). Payment is arranged via Amazon Mechanical Turk.
+We will be offering bonuses for workers who score highly on the games. Workers will have a chance to earn up to \$6.25 in total. So make sure you try your best to do well on these games!
+Now, please complete the 2-question quiz below to start the experiment.
+
+
+
+
What does it mean if the "Auto Pilot" indicator is green (active)?
+
+
+
+
+
+
+
+ The Safe And secure Foundation mOdel systems Lab (SaFoLab) at the University of Wisconsin, Madison, led by Professor Chaowei Xiao, is dedicated to pioneering research in trustworthy (MultiModal) Large Language Model Systems.
+ Our mission is to develop robust and secure AI systems that can be trusted across various application domains.
+
+
+
+
Recent News
Check out our GitHub group for our latest projects and publications.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
2024-08:
+
We received a USENIX Security Distinguished Paper Award.
+
+
+
+
+
+
2024-07:
+
4/4 papers are accepted to ECCV on the topic of trustworthy VLM and driving. Two of them are from interns in my group.
+
+
+
+
+
+
2024-06:
+
Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of vision large language models at CVPR.
+
+
+
+
+
+
2024-06:
+
Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of large language models at NAACL.
+
+
+
+
+
+
2024-05:
+
Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of large language models at ICLR.
+
+
+
+
+
+
2024-05:
+
Our jailbreak paper is accepted to USENIX Security. Congratulations, Zhiyuan!
+
+
+
+
+
+
2024-03:
+
Five papers at NAACL on LLM security (4 main and 1 Findings): two on backdoor attacks, one on backdoor defense, one on jailbreak attacks, and one on model fingerprinting. Stay tuned for more in these exciting areas.
+
+
+
+
+
+
2024-03:
+
PreDa for personalized federated learning is accepted at CVPR 2024.
Our paper MoleculeSTM has been accepted to Nature Machine Intelligence. MoleculeSTM aims to align the nature language and molecule representation into the same representation space.
+
+
+
+
+
+
2023-10:
+
Three papers at EMNLP and one paper at NeurIPS. In our NeurIPS paper, we study a new threat to the instruction tuning of LLMs by injecting ads. This is the first work that views LLMs as generative models and attacks their generative property.
+
+
+
+
+
+
2023-10:
+
Our tutorial on Security and Privacy in the Era of Large Language Models is accepted to NAACL.
+
+
+
+
+
+
2023-05:
+
One paper at ACL. Congratulations to Zhuofeng and Jiazhao. We propose an attention-based method to defend against NLP backdoor attacks.
+
+
+
+
+
+
2023-04:
+
Two papers at ICML. Congratulations to Jiachen and Zhiyuan. We propose the first benchmark for code copyright of code generation models.
+
+
+
+
+
+
2023-02:
+
Two papers at CVPR. Congratulations to Yiming and Xiaogeng. Xiaogeng is an intern from my group at ASU.
+
+
+
+
+
+
2023-02:
+
I will give a tutorial at CVPR 2023 on the topic of trustworthiness in the era of Foundation Models. Stay tuned!
+
+
+
+
+
+
2023-01:
+
Impact Award from Argonne National Laboratory.
+
+
+
+
+
+
2023-01:
+
One paper got accepted to USENIX Security 2023.
+
+
+
+
+
+
2023-01:
+
Three papers are accepted to ICLR 2023. [a] We explain why and how to use diffusion models to improve adversarial robustness, and design DensePure, which leverages a pretrained diffusion model and classifier to provide state-of-the-art certified robustness. [b] This is our first attempt at a retrieval-based framework and AI for drug discovery; we will release more work in this research line soon. Stay tuned!
[12/2022] Our team won the ACM Gordon Bell Special Prize for COVID-19 Research.
+
+
+
+
+
+
2022-09:
+
One paper was accepted to USENIX Security 2023.
+
+
+
+
+
+
2022-09:
+
Two papers got accepted to NeurIPS 2022.
+
+
+
+
+
+
2022-09:
+
Our paper RobustTraj has been accepted to CoRL for an oral presentation. We explore training a trajectory prediction model that is robust to adversarial attacks.
+ Check out our YouTube channel for the latest talks and supplementary videos for our publications.
+
+
+
+
+
+
2024-08: We received a USENIX Security Distinguished Paper Award.
+
+
2024-07: 4/4 papers are accepted to ECCV on the topic of trustworthy VLM and driving. Two of them are from interns in my group.
+
+
2024-06: Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of vision large language models at CVPR.
+
+
2024-06: Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of large language models at NAACL.
+
+
2024-05: Prof. Chaowei Xiao will give a talk to discuss recent progress on security in the era of large language models at ICLR.
+
+
2024-05: Our jailbreak paper is accepted to USENIX Security. Congratulations, Zhiyuan!
+
+
2024-03: Five papers at NAACL on LLM security (4 main and 1 Findings): two on backdoor attacks, one on backdoor defense, one on jailbreak attacks, and one on model fingerprinting. Stay tuned for more in these exciting areas.
+
+
2024-03: PreDa for personalized federated learning is accepted at CVPR 2024.
+
+
2024-01: Three papers at ICLR.
+
+
2024-01: Two papers at TMLR.
+
+
2023-12: Invited Talk at NeurIPS TDW workshop
+
+
2023-10: Our paper MoleculeSTM has been accepted to Nature Machine Intelligence. MoleculeSTM aims to align natural language and molecule representations in the same representation space.
+
+
2023-10: Three papers at EMNLP and one paper at NeurIPS. In our NeurIPS paper, we study a new threat to the instruction tuning of LLMs by injecting ads. This is the first work that views LLMs as generative models and attacks their generative property.
+
+
2023-10: Our tutorial on Security and Privacy in the Era of Large Language Models is accepted to NAACL.
+
+
2023-05: One paper at ACL. Congratulations to Zhuofeng and Jiazhao. We propose an attention-based method to defend against NLP backdoor attacks.
+
+
2023-04: Two papers at ICML. Congratulations to Jiachen and Zhiyuan. We propose the first benchmark for code copyright of code generation models.
+
+
2023-02: Two papers at CVPR. Congratulations to Yiming and Xiaogeng. Xiaogeng is an intern from my group at ASU.
+
+
2023-02: I will give a tutorial at CVPR 2023 on the topic of trustworthiness in the era of Foundation Models. Stay tuned!
+
+
2023-01: Impact Award from Argonne National Laboratory.
+
+
2023-01: One paper got accepted to USENIX Security 2023.
+
+
2023-01: Three papers are accepted to ICLR 2023. [a] We explain why and how to use diffusion models to improve adversarial robustness, and design DensePure, which leverages a pretrained diffusion model and classifier to provide state-of-the-art certified robustness. [b] This is our first attempt at a retrieval-based framework and AI for drug discovery; we will release more work in this research line soon. Stay tuned!
+
+
2022-12: Our team won the ACM Gordon Bell Special Prize for COVID-19 Research.
+
+
2022-09: One paper was accepted to USENIX Security 2023.
+
+
2022-09: Two papers got accepted to NeurIPS 2022.
+
+
2022-09: Our paper RobustTraj has been accepted to CoRL for an oral presentation. We explore training a trajectory prediction model that is robust to adversarial attacks.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
diff --git a/pdfs/publications/basu2019active.pdf b/pdfs/publications/basu2019active.pdf
new file mode 100644
index 0000000..081c8e4
Binary files /dev/null and b/pdfs/publications/basu2019active.pdf differ
diff --git a/pdfs/publications/baykara2017real.pdf b/pdfs/publications/baykara2017real.pdf
new file mode 100644
index 0000000..786f699
Binary files /dev/null and b/pdfs/publications/baykara2017real.pdf differ
diff --git a/pdfs/publications/beliaev2020emergent.pdf b/pdfs/publications/beliaev2020emergent.pdf
new file mode 100644
index 0000000..940c5e5
Binary files /dev/null and b/pdfs/publications/beliaev2020emergent.pdf differ
diff --git a/pdfs/publications/beliaev2021incentivizing.pdf b/pdfs/publications/beliaev2021incentivizing.pdf
new file mode 100644
index 0000000..eff828c
Binary files /dev/null and b/pdfs/publications/beliaev2021incentivizing.pdf differ
diff --git a/pdfs/publications/biyik2018altruistic.pdf b/pdfs/publications/biyik2018altruistic.pdf
new file mode 100644
index 0000000..e16c989
Binary files /dev/null and b/pdfs/publications/biyik2018altruistic.pdf differ
diff --git a/pdfs/publications/biyik2018batch.pdf b/pdfs/publications/biyik2018batch.pdf
new file mode 100644
index 0000000..5c72f8a
Binary files /dev/null and b/pdfs/publications/biyik2018batch.pdf differ
diff --git a/pdfs/publications/biyik2019asking.pdf b/pdfs/publications/biyik2019asking.pdf
new file mode 100644
index 0000000..8445dce
Binary files /dev/null and b/pdfs/publications/biyik2019asking.pdf differ
diff --git a/pdfs/publications/biyik2019efficient.pdf b/pdfs/publications/biyik2019efficient.pdf
new file mode 100644
index 0000000..50f4c7f
Binary files /dev/null and b/pdfs/publications/biyik2019efficient.pdf differ
diff --git a/pdfs/publications/biyik2019green.pdf b/pdfs/publications/biyik2019green.pdf
new file mode 100644
index 0000000..7fd3b0f
Binary files /dev/null and b/pdfs/publications/biyik2019green.pdf differ
diff --git a/pdfs/publications/biyik2020active.pdf b/pdfs/publications/biyik2020active.pdf
new file mode 100644
index 0000000..597d75d
Binary files /dev/null and b/pdfs/publications/biyik2020active.pdf differ
diff --git a/pdfs/publications/biyik2021incentivizing.pdf b/pdfs/publications/biyik2021incentivizing.pdf
new file mode 100644
index 0000000..9c7edb1
Binary files /dev/null and b/pdfs/publications/biyik2021incentivizing.pdf differ
diff --git a/pdfs/publications/biyik2021learning.pdf b/pdfs/publications/biyik2021learning.pdf
new file mode 100644
index 0000000..b3b318e
Binary files /dev/null and b/pdfs/publications/biyik2021learning.pdf differ
diff --git a/pdfs/publications/biyik2022aprel.pdf b/pdfs/publications/biyik2022aprel.pdf
new file mode 100644
index 0000000..183f4dd
Binary files /dev/null and b/pdfs/publications/biyik2022aprel.pdf differ
diff --git a/pdfs/publications/biyik2022aprel_workshop.pdf b/pdfs/publications/biyik2022aprel_workshop.pdf
new file mode 100644
index 0000000..3dfe5f6
Binary files /dev/null and b/pdfs/publications/biyik2022aprel_workshop.pdf differ
diff --git a/pdfs/publications/biyik2022learning.pdf b/pdfs/publications/biyik2022learning.pdf
new file mode 100644
index 0000000..4c06557
Binary files /dev/null and b/pdfs/publications/biyik2022learning.pdf differ
diff --git a/pdfs/publications/biyik2022partner.pdf b/pdfs/publications/biyik2022partner.pdf
new file mode 100644
index 0000000..c0b0e36
Binary files /dev/null and b/pdfs/publications/biyik2022partner.pdf differ
diff --git a/pdfs/publications/biyik2022partner_workshop.pdf b/pdfs/publications/biyik2022partner_workshop.pdf
new file mode 100644
index 0000000..57490b5
Binary files /dev/null and b/pdfs/publications/biyik2022partner_workshop.pdf differ
diff --git a/pdfs/publications/biyik2022pioneers.pdf b/pdfs/publications/biyik2022pioneers.pdf
new file mode 100644
index 0000000..6bcf6c2
Binary files /dev/null and b/pdfs/publications/biyik2022pioneers.pdf differ
diff --git a/pdfs/publications/biyik2023preference.pdf b/pdfs/publications/biyik2023preference.pdf
new file mode 100644
index 0000000..fb7cd6b
Binary files /dev/null and b/pdfs/publications/biyik2023preference.pdf differ
diff --git a/pdfs/publications/biyik2024active.pdf b/pdfs/publications/biyik2024active.pdf
new file mode 100644
index 0000000..a5aefe0
Binary files /dev/null and b/pdfs/publications/biyik2024active.pdf differ
diff --git a/pdfs/publications/biyik2024batch.pdf b/pdfs/publications/biyik2024batch.pdf
new file mode 100644
index 0000000..097652c
Binary files /dev/null and b/pdfs/publications/biyik2024batch.pdf differ
diff --git a/pdfs/publications/brockbank2022people.pdf b/pdfs/publications/brockbank2022people.pdf
new file mode 100644
index 0000000..729f5c4
Binary files /dev/null and b/pdfs/publications/brockbank2022people.pdf differ
diff --git a/pdfs/publications/cao2020reinforcement.pdf b/pdfs/publications/cao2020reinforcement.pdf
new file mode 100644
index 0000000..5e832fe
Binary files /dev/null and b/pdfs/publications/cao2020reinforcement.pdf differ
diff --git a/pdfs/publications/cao2022leveraging.pdf b/pdfs/publications/cao2022leveraging.pdf
new file mode 100644
index 0000000..f41ea55
Binary files /dev/null and b/pdfs/publications/cao2022leveraging.pdf differ
diff --git a/pdfs/publications/casper2023open.pdf b/pdfs/publications/casper2023open.pdf
new file mode 100644
index 0000000..96dd3a5
Binary files /dev/null and b/pdfs/publications/casper2023open.pdf differ
diff --git a/pdfs/publications/ellis2024generalized.pdf b/pdfs/publications/ellis2024generalized.pdf
new file mode 100644
index 0000000..709e91c
Binary files /dev/null and b/pdfs/publications/ellis2024generalized.pdf differ
diff --git a/pdfs/publications/kwon2020when.pdf b/pdfs/publications/kwon2020when.pdf
new file mode 100644
index 0000000..06ba0de
Binary files /dev/null and b/pdfs/publications/kwon2020when.pdf differ
diff --git a/pdfs/publications/kwon2021when_workshop.pdf b/pdfs/publications/kwon2021when_workshop.pdf
new file mode 100644
index 0000000..e37c114
Binary files /dev/null and b/pdfs/publications/kwon2021when_workshop.pdf differ
diff --git a/pdfs/publications/lazar2021learning.pdf b/pdfs/publications/lazar2021learning.pdf
new file mode 100644
index 0000000..da702ca
Binary files /dev/null and b/pdfs/publications/lazar2021learning.pdf differ
diff --git a/pdfs/publications/li2021roial.pdf b/pdfs/publications/li2021roial.pdf
new file mode 100644
index 0000000..1cd3ef5
Binary files /dev/null and b/pdfs/publications/li2021roial.pdf differ
diff --git a/pdfs/publications/liang2023visarl.pdf b/pdfs/publications/liang2023visarl.pdf
new file mode 100644
index 0000000..16c4a24
Binary files /dev/null and b/pdfs/publications/liang2023visarl.pdf differ
diff --git a/pdfs/publications/liang2024dynamiterl.pdf b/pdfs/publications/liang2024dynamiterl.pdf
new file mode 100644
index 0000000..63c77bf
Binary files /dev/null and b/pdfs/publications/liang2024dynamiterl.pdf differ
diff --git a/pdfs/publications/myers2021learning.pdf b/pdfs/publications/myers2021learning.pdf
new file mode 100644
index 0000000..79c5254
Binary files /dev/null and b/pdfs/publications/myers2021learning.pdf differ
diff --git a/pdfs/publications/myers2023active.pdf b/pdfs/publications/myers2023active.pdf
new file mode 100644
index 0000000..c647e49
Binary files /dev/null and b/pdfs/publications/myers2023active.pdf differ
diff --git a/pdfs/publications/pan2024coprocessor.pdf b/pdfs/publications/pan2024coprocessor.pdf
new file mode 100644
index 0000000..980320a
Binary files /dev/null and b/pdfs/publications/pan2024coprocessor.pdf differ
diff --git a/pdfs/publications/sontakke2023roboclip.pdf b/pdfs/publications/sontakke2023roboclip.pdf
new file mode 100644
index 0000000..0a35359
Binary files /dev/null and b/pdfs/publications/sontakke2023roboclip.pdf differ
diff --git a/pdfs/publications/sontakke2024foundation.pdf b/pdfs/publications/sontakke2024foundation.pdf
new file mode 100644
index 0000000..0314e31
Binary files /dev/null and b/pdfs/publications/sontakke2024foundation.pdf differ
diff --git a/pdfs/publications/srivastava2022assistive.pdf b/pdfs/publications/srivastava2022assistive.pdf
new file mode 100644
index 0000000..1d62d89
Binary files /dev/null and b/pdfs/publications/srivastava2022assistive.pdf differ
diff --git a/pdfs/publications/tien2024optimizing.pdf b/pdfs/publications/tien2024optimizing.pdf
new file mode 100644
index 0000000..52ea40a
Binary files /dev/null and b/pdfs/publications/tien2024optimizing.pdf differ
diff --git a/pdfs/publications/wang2021emergent.pdf b/pdfs/publications/wang2021emergent.pdf
new file mode 100644
index 0000000..29c16ad
Binary files /dev/null and b/pdfs/publications/wang2021emergent.pdf differ
diff --git a/pdfs/publications/wang2024rlvlmf.pdf b/pdfs/publications/wang2024rlvlmf.pdf
new file mode 100644
index 0000000..233ff62
Binary files /dev/null and b/pdfs/publications/wang2024rlvlmf.pdf differ
diff --git a/pdfs/publications/wilde2021learning.pdf b/pdfs/publications/wilde2021learning.pdf
new file mode 100644
index 0000000..ce670e5
Binary files /dev/null and b/pdfs/publications/wilde2021learning.pdf differ
diff --git a/pdfs/publications/zhu2020multi.pdf b/pdfs/publications/zhu2020multi.pdf
new file mode 100644
index 0000000..3dfb6e9
Binary files /dev/null and b/pdfs/publications/zhu2020multi.pdf differ
diff --git a/people/index.html b/people/index.html
new file mode 100644
index 0000000..2bef705
--- /dev/null
+++ b/people/index.html
@@ -0,0 +1,433 @@
+
+
+
+
+
+
+
+
+
+
+ People – SaFoLab-WISC
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
People
+
+
+
+
+
+
+
+
Faculty Members
+
+
+
+
+
+
+
+
Chaowei Xiao
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Chaowei is an assistant professor at the University of Wisconsin, Madison, and also a research scientist at NVIDIA.
+
+
+
+ A new algorithm developed by USC computer science researchers shows that robots can, in computer simulations, learn tasks after a single demonstration.
+
+
+
+
+
+ In recent years, researchers have been trying to develop methods that enable robots to learn new skills. One option is for a robot to learn these new skills from humans, asking questions whenever it is unsure about how to behave, and learning from the human user's responses.
+
+
+
+ SecretGen: Privacy Recovery on Pre-trained Models
+ Zhuowen Yuan, Fan Wu, Yunhui Long, Chaowei Xiao, and Bo Li
+ ECCV 2022
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Taxonomy of Machine Learning Safety: A Survey and Primer
+ Sina Mohseni, Zhiding Yu, Chaowei Xiao, Jay Yadawa, Haotao Wang, and Zhangyang Wang
+ ACM Computing Surveys
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ AdvIT: Characterizing Adversarial Frames in Videos Based on Temporal Information
+ Chaowei Xiao, Ruizhi Deng, Bo Li, Taesung Lee, Benjamin Edwards, Jinfeng Yi, Dawn Song, Mingyan Liu, Ian Molloy
+ ICCV 2019
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Adversarial Sensor Attack on LIDAR-based Perception in Autonomous Driving
+ Yulong Cao, Chaowei Xiao, Benjamin Cyr, Yimeng Zhou, Won Park, Sara Rampazzi, Qi Alfred Chen, Kevin Fu, Z. Morley Mao
+ CCS 2019
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Improving Robustness of ML Classifiers against Realizable Evasion Attacks Using Conserved Features
+ Liang Tong, Bo Li, Chen Hajaj, Chaowei Xiao, Ning Zhang, Yevgeniy Vorobeychik
+ USENIX Security 2019
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Performing Co-Membership Attacks Against Deep Generative Models
+ Kin Sum Liu, Chaowei Xiao, Bo Li, Jie Gao
+ ICDM 2019
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Characterize Adversarial Examples Based on Spatial Consistency Information for Semantic Segmentation
+ Chaowei Xiao, Ruizhi Deng, Bo Li, Fisher Yu, Mingyan Liu, Dawn Song
+ ECCV 2018
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Spatially Transformed Adversarial Examples
+ Chaowei Xiao*, Jun-Yan Zhu*, Bo Li, Warren He, Mingyan Liu and Dawn Song
+ ICLR, 2018
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ * denotes equal contribution
+
+
+
+
+
+
+
+
+
+
+
+
+ Generating Adversarial Examples with Adversarial Networks
+ Chaowei Xiao, Bo Li, Jun-Yan Zhu, Warren He, Mingyan Liu and Dawn Song
+ IJCAI, 2018.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ From Patching Delays to Infection Symptoms: Using Risk Profiles for an Early Discovery of Vulnerabilities Exploited in the Wild
+ Chaowei Xiao, Armin Sarabi, Yang Liu, Bo Li, Tudor Dumitraș, Mingyan Liu
+ USENIX Security 2018
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Robust Physical-World Attacks on Machine Learning Models
+ Kevin Eykholt*, Ivan Evtimov*, Earlence Fernandes, Bo Li, Amir Rahmati, Chaowei Xiao, Atul Prakash, Tadayoshi Kohno and Dawn Song
+ CVPR, 2018
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Static Power of Mobile Devices: Self-updating Radio Maps for Wireless Indoor Localization
+ Chenshu Wu, Zheng Yang, Chaowei Xiao, Chaofan Yang, Yunhao Liu, Mingyan Liu
+ INFOCOM 2015
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+ Tagoram: Real-time Tracking of Mobile RFID Tags to High Precision Using COTS Devices
+ Lei Yang, Yekui Chen, Xiangyang Li, Chaowei Xiao, Mo Li, Yunhao Liu
+ MobiCom 2014 (Best Paper Award)
+
+
+
+
+
+
+
+ The field of robotics has made significant advances over the past few decades, and we can finally start thinking about intricate and long-term interactions with humans and the environment.
+The question that remains is what we can learn from these intricate interactions, and how.
+
+
+
+In ILIAD, we are interested in two core objectives: 1) efficiently learning computational models of human behavior (reward functions or policies) from diverse sources of interaction data, and 2) learning effective robot policies from interaction data.
+This introduces a set of research challenges including but not limited to:
+
+
How can we actively and efficiently collect data in a low data regime setting such as in interactive robotics?
+
+
How can we tap into different sources and modalities --- perfect and imperfect demonstrations, comparison and ranking queries, physical feedback, language instructions, videos --- to learn an effective human model or robot policy?
+
+
+ What inductive biases and priors can help with effectively learning from human/interaction data?
+
+
+
+
+
+
+
+
+
+
Active Learning of Reward Functions
+
+
+ Human preferences play a key role in specifying how robotic systems should act, e.g., how an assistive robot arm should move, or how an autonomous car should drive. However, a significant part of the success of reward learning algorithms can be attributed to the availability of large amounts of labeled data. Unfortunately, collecting and labeling data is costly and time-consuming in most robotics applications. In addition, humans are not always capable of reliably assigning a success value (reward) to a given robot action, and their demonstrations are more often than not suboptimal due to the difficulty of operating robots with more than a few degrees of freedom.
+
+ Our work develops active learning algorithms that efficiently query users for the most informative piece of data
+ [RSS 2017,
+ CoRL 2018,
+ RSS 2019,
+ CoRL 2019,
+ IROS 2019,
+ CDC 2019,
+ RSS 2020,
+ ICRA 2021,
+ ICCPS 2021,
+ AI-HRI 2021,
+ CoRL 2021a,
+ CoRL 2021b].
+
+ We study learning from diverse sources of human data, such as optimal and suboptimal demonstrations, pairwise comparisons, best-of-many selections, rankings, scale feedback, physical corrections, and language instructions. We also investigate how to optimally integrate these different sources. Specifically, when learning from both expert demonstrations and active comparison queries, we prove that the optimal integration is to warm-start the reward learning algorithm by learning from expert demonstrations, and fine-tune the model using active preferences. Aside from learning through queries, we also investigate how to understand other sources of human-robot interaction, such as physical interactions. By reasoning over human physical corrections, robots can adaptively improve their reward estimates.
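To make the querying loop concrete, here is a minimal sketch of active preference-based reward learning: a linear reward over trajectory features, a Bradley-Terry model of the human's answers, a particle belief over the reward weights, and a pick-the-most-uncertain-query heuristic. The feature values, particle count, and query-selection rule are illustrative assumptions, not the exact formulations in the papers above.

```python
import math, random

random.seed(0)

def bradley_terry(w, phi_a, phi_b):
    """P(human prefers trajectory A over B) under reward r = w . phi."""
    ra = sum(wi * fi for wi, fi in zip(w, phi_a))
    rb = sum(wi * fi for wi, fi in zip(w, phi_b))
    return 1.0 / (1.0 + math.exp(rb - ra))

# Particle approximation of the belief over reward weights w.
particles = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
weights = [1.0] * len(particles)

def most_informative_query(candidates):
    """Pick the trajectory pair whose predicted answer is closest to 50/50
    under the current belief (a cheap proxy for information gain)."""
    def uncertainty(pair):
        phi_a, phi_b = pair
        total = sum(weights)
        p = sum(wt * bradley_terry(p_, phi_a, phi_b)
                for wt, p_ in zip(weights, particles)) / total
        return abs(p - 0.5)
    return min(candidates, key=uncertainty)

def update_belief(phi_a, phi_b, human_prefers_a):
    """Bayesian reweighting of the particles given the human's answer."""
    for i, p_ in enumerate(particles):
        lik = bradley_terry(p_, phi_a, phi_b)
        weights[i] *= lik if human_prefers_a else (1.0 - lik)

# Simulated human with true weights w* = (1, -1): likes feature 0, dislikes 1.
true_w = [1.0, -1.0]
pairs = [([random.random(), random.random()], [random.random(), random.random()])
         for _ in range(50)]
for _ in range(20):
    phi_a, phi_b = most_informative_query(pairs)
    answer = bradley_terry(true_w, phi_a, phi_b) > 0.5
    update_belief(phi_a, phi_b, answer)

total = sum(weights)
w_est = [sum(wt * p_[d] for wt, p_ in zip(weights, particles)) / total
         for d in (0, 1)]
print(w_est)  # the posterior mean should recover the sign pattern (+, -)
```

After twenty queries the posterior concentrates on weight vectors consistent with the simulated answers, recovering the sign of each feature's contribution without any demonstrations.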
+
+
+
+
+
Learning from Imperfect Demonstrations
+
+Standard imitation learning algorithms often rely on expert, optimal demonstrations. In practice, it is expensive to obtain large amounts of expert data, but we usually have access to a plethora of imperfect demonstrations: suboptimal behavior that can range from random noise or failures to nearly optimal demonstrations.
+ The main challenge of learning from imperfect demonstrations is extracting useful knowledge from this data while avoiding the influence of harmful behaviors: we need to down-weight noisy or malicious inputs while learning from the nearly optimal states and actions. We develop an approach that assigns a feasibility metric to deal with out-of-dynamics demonstrations and an optimality metric to deal with suboptimal demonstrations [RA-L 2021]. We further demonstrate that such metrics can be directly learned from a small amount of ranking data, and propose an iterative approach that jointly learns a confidence value over demonstrations and the policy parameters [NeurIPS 2021].
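A toy sketch of the iterative idea: alternate between fitting a policy to confidence-weighted demonstrations and re-estimating each demonstration's confidence from how well the current policy explains it. The 1-D linear policy and the specific reweighting rule are illustrative assumptions, not the algorithm from the papers.

```python
import random
random.seed(0)

# Demonstrations: (state, action) pairs. The expert's policy is a = 2 * s,
# but some demonstrations are corrupted (e.g., random flailing).
good = [(s / 10.0, 2.0 * s / 10.0 + random.gauss(0, 0.05)) for s in range(30)]
bad = [(random.random(), random.uniform(-3, 3)) for _ in range(10)]
demos = good + bad

confidence = [1.0] * len(demos)  # one confidence value per demonstration
theta = 0.0                      # policy a = theta * s

for _ in range(10):
    # Policy step: confidence-weighted least-squares fit.
    num = sum(c * s * a for c, (s, a) in zip(confidence, demos))
    den = sum(c * s * s for c, (s, a) in zip(confidence, demos))
    theta = num / den
    # Confidence step: demonstrations the current policy explains well get
    # higher confidence (soft down-weighting of outliers).
    for i, (s, a) in enumerate(demos):
        err = (a - theta * s) ** 2
        confidence[i] = 1.0 / (1.0 + 10.0 * err)

print(round(theta, 2))  # close to the expert gain of 2.0
```

The corrupted demonstrations end up with low confidence, so the fitted policy converges to the expert's behavior rather than an average of expert and noise.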
+
+
+
Risk-Aware Human Models
+
+Many of today’s robots model humans as if they are always optimal or noisily rational. Both of these models make sense when the human receives deterministic rewards. But in real-world scenarios, rewards are rarely deterministic. Instead, we consider settings where humans need to make choices subject to risk and uncertainty. In these settings, humans exhibit a cognitive bias towards suboptimal behavior.
+
+
+
+We adopt a well-known Risk-Aware human model from behavioral
+economics called Cumulative Prospect Theory and enable robots
+to leverage this model during human-robot interaction. Our work extends existing
+rational human models so that collaborative robots can anticipate
+and plan around suboptimal human behavior during interaction.
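As a sketch of what this model computes, the snippet below implements the Cumulative Prospect Theory value and probability-weighting functions (using the standard Tversky-Kahneman parameter estimates, and a simplified per-outcome weighting rather than the full rank-dependent form) and shows a gamble where a purely rational model and a Risk-Aware model predict opposite human choices. The payoff numbers are made up for illustration.

```python
# Standard Tversky-Kahneman parameter estimates (alpha, lambda, gamma).
ALPHA, LAM, GAMMA = 0.88, 2.25, 0.61

def value(x):
    """S-shaped value function: risk-averse for gains, loss-averse overall."""
    return x ** ALPHA if x >= 0 else -LAM * (-x) ** ALPHA

def weight(p):
    """Inverse-S probability weighting: rare events are overweighted."""
    return p ** GAMMA / (p ** GAMMA + (1 - p) ** GAMMA) ** (1 / GAMMA)

def cpt_utility(outcomes):
    """outcomes: list of (probability, reward). Simplified CPT score
    (per-outcome weighting, not the full rank-dependent formulation)."""
    return sum(weight(p) * value(x) for p, x in outcomes)

def expected_value(outcomes):
    return sum(p * x for p, x in outcomes)

# A risky tower: 80% chance of stacking 5 cups, 20% chance it collapses (-2),
# versus a stable tower that always stacks 3 cups.
risky = [(0.8, 5.0), (0.2, -2.0)]
stable = [(1.0, 3.0)]

print(expected_value(risky) > expected_value(stable))  # rational model: take the risk
print(cpt_utility(stable) > cpt_utility(risky))        # CPT model: play it safe
```

A noisily rational model would predict the human goes for the risky, higher-expected-value tower; the Risk-Aware model predicts the safe choice, which matches the suboptimal-but-cautious behavior described above.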
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+In a collaborative cup stacking task, a Risk-Aware robot correctly predicts how the human wants to stack cups: it anticipates that the human is overly concerned about the tower falling, and starts to build the less efficient but stable tower. Having the right prediction here prevents the human and robot from reaching for the same cup, so that they collaborate more seamlessly during the task!
+Our work integrates learning techniques along with modeling cognitive biases to anticipate human behavior in risk-sensitive scenarios, and better coordinate and collaborate with humans [RSS 2020b, HRI 2020].
+
+
+
+Incomplete List of Related Publications:
+
+
Vivek Myers, Erdem Bıyık, Nima Anari, Dorsa Sadigh. Learning Multimodal Rewards from Rankings. Proceedings of the 5th Conference on Robot Learning (CoRL), 2021. [PDF]
+
Erdem Bıyık, Dylan P. Losey, Malayandi Palan, Nicholas C. Landolfi, Gleb Shevchuk, Dorsa Sadigh. Learning Reward Functions from Diverse Sources of Human Feedback: Optimally Integrating Demonstrations and Preferences. The International Journal of Robotics Research (IJRR), 2021. [PDF]
+
Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui. Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality. Conference on Neural Information Processing Systems (NeurIPS), 2021. [PDF]
+
Mengxi Li, Alper Canberk, Dylan P. Losey, Dorsa Sadigh. Learning Human Objectives from Sequences of Physical Corrections. International Conference on Robotics and Automation (ICRA), 2021. [PDF]
+
Erdem Bıyık*, Nicolas Huynh*, Mykel J. Kochenderfer, Dorsa Sadigh. Active Preference-Based Gaussian Process Regression for Reward Learning. Proceedings of Robotics: Science and Systems (RSS), July 2020. [PDF]
+
Minae Kwon, Erdem Bıyık, Aditi Talati, Karan Bhasin, Dylan P. Losey, Dorsa Sadigh. When Humans Aren't Optimal: Robots that Collaborate with Risk-Aware Humans. ACM/IEEE International Conference on Human-Robot Interaction (HRI), March 2020. [PDF]
+
Dorsa Sadigh, Anca D. Dragan, S. Shankar Sastry, Sanjit A. Seshia. Active Preference-Based Learning of Reward Functions. Proceedings of Robotics: Science and Systems (RSS), July 2017. [PDF]
+
\ No newline at end of file
diff --git a/research/implications.html b/research/implications.html
new file mode 100644
index 0000000..4189600
--- /dev/null
+++ b/research/implications.html
@@ -0,0 +1,54 @@
+
+Large-scale coordination problems such as driving on highways or negotiating effectively with another agent are ubiquitous. Our ability to develop solutions (sometimes interpreted as norms or equilibria in strategic games) to these decentralized coordination problems has been critical to the development of large-scale societies. Intelligent and autonomous agents need to understand and positively influence these norms.
+
+Under this theme, one of our focuses is on mixed-autonomy traffic networks, with the goal of analyzing the societal implications of autonomy. We are investigating the effects of autonomous cars on societal objectives such as traffic congestion, to see how they can increase traffic flow and reduce delays. More recently, we have studied other influencing mechanisms that enable reaching socially optimal equilibria beyond the space of driving. We are also interested in mixed-initiative settings, and in developing AI agents that can negotiate and coordinate with each other in the presence or absence of a communication channel.
+
+
+
+
+
Influencing Human Driving Policies
+
+In addition to their platooning capabilities, autonomous vehicles also have the ability to influence other drivers' behaviors. We develop interaction-level policies for autonomous cars that increase the efficiency of traffic networks by influencing human drivers and maximizing the gain obtained from platooning. Our work is one of the first to connect micro-level vehicle interactions with macro-level traffic models [CDC 2018, TCNS 2021].
+
+
+
Influencing Routing Policies
+
+
+From a higher-level perspective, it is possible to reduce traffic congestion by carefully optimizing for the autonomy level of the roads. To achieve that, we first analyze different behaviors (Nash Equilibria, Best Nash Equilibria, Robust Best Nash Equilibria) that emerge in mixed-autonomy traffic networks. We introduce the notion of altruistic autonomy that models how a fleet of autonomous vehicles can act altruistically (as opposed to selfishly), by accepting a larger delay. We analyze Best Altruistic Nash Equilibria that can reduce congestion on parallel traffic networks [WAFR 2018, TCNS 2021].
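The classic Pigou-style example below illustrates the gap that altruistic autonomy can close on a parallel network: selfish routing settles at a worse average latency than the social optimum, which requires some drivers to accept a longer individual delay. The specific latency functions are a textbook illustration, not taken from our papers.

```python
# Pigou-style parallel network: road A has constant latency 1.0; road B's
# latency equals the fraction of traffic on it. One unit of total demand.
def avg_latency(x_b):
    """Average latency when fraction x_b of traffic takes road B."""
    return (1 - x_b) * 1.0 + x_b * x_b

# Selfish (Wardrop) equilibrium: everyone takes road B, since its latency
# never exceeds road A's, so the average latency is 1.0.
selfish = avg_latency(1.0)

# Social optimum: minimize average latency over the split. Half the traffic
# (the altruists) must accept road A's longer individual delay.
social = min(avg_latency(k / 1000) for k in range(1001))

print(selfish, round(social, 2))  # 1.0 vs. 0.75 at a 50/50 split
```

The 25% latency gap between the two outcomes is exactly the inefficiency that altruistic (or incentivized) routing of an autonomous fleet can recover.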
+
+ We further develop reinforcement learning algorithms that optimize autonomous cars' routing choices to ensure all cars experience the minimum possible latency.
+
+ We generalize our work to both the absence and presence of altruistic drivers, who can (be made to) take longer routes for social good, as well as to the presence of perturbations such as accidents [TR-C].
+
+
+
Emergent Prosociality in Multi-Agent Games
+
+
+In addition to altruistic agents, we also investigate alternative ways to incentivize multi-agent systems to move from a socially suboptimal Nash equilibrium to a more socially desirable one. We study and analyze gifting, a decentralized peer-rewarding mechanism, in matrix games such as Stag Hunt. We show that reinforcement learning agents equipped with gifting actions reach the prosocial equilibrium more often, even when they are completely selfish [IJCAI 2021].
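A minimal illustration of why gifting helps. The payoff numbers and the one-shot, uniform-belief analysis are illustrative assumptions; the paper studies learning agents over repeated play rather than one-shot best responses.

```python
# Stag Hunt payoffs for the row player: (my action, partner action) -> reward.
# Stag-stag is the prosocial equilibrium; hare is the safe, selfish choice.
PAYOFF = {("stag", "stag"): 4.0, ("stag", "hare"): 0.0,
          ("hare", "stag"): 3.0, ("hare", "hare"): 3.0}

def expected_payoff(my_action, belief_stag, gift=0.0):
    """Expected reward under a belief about the partner, when the partner
    sends me a gift (a peer reward) whenever I play stag."""
    bonus = gift if my_action == "stag" else 0.0
    return (belief_stag * PAYOFF[(my_action, "stag")]
            + (1 - belief_stag) * PAYOFF[(my_action, "hare")]) + bonus

belief = 0.5  # uncertain about the partner

# Without gifting, hare is the risk-dominant best response...
print(expected_payoff("hare", belief) > expected_payoff("stag", belief))   # True

# ...but a gift of 1.5 for cooperative play tips the best response to stag.
print(expected_payoff("stag", belief, gift=1.5) > expected_payoff("hare", belief))  # True
```

Even a modest peer reward makes cooperating worthwhile under uncertainty about the partner, which is the mechanism by which gifting steers learning agents toward the prosocial equilibrium.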
+
+
+
Designing Negotiation Agents
+
+
+
+
+
+Many real-world interactions are mixed-incentive, where agents have partially aligned goals. As AI agents become embedded in society, it is critical they learn to coordinate with their partners to achieve equitable outcomes. Of the skills necessary to do this, negotiation is paramount. Effective negotiators therefore need to optimize for their own self-interest while also being able to compromise where it makes sense for them to do so. We build negotiation agents that improve their capacity to achieve negotiation outcomes that advance their self-interest and are also Pareto-optimal. We accomplish this through targeted data acquisition, using active learning to acquire new data and expand the pool of negotiation examples we train on [ICML 2021].
+
+
+
+
+Incomplete List of Related Publications:
+
+
Erdem Bıyık*, Daniel A. Lazar*, Ramtin Pedarsani, Dorsa Sadigh. Incentivizing Efficient Equilibria in Traffic Networks with Mixed Autonomy. IEEE Transactions on Control of Network Systems (TCNS), 2021. [PDF]
+
Daniel A. Lazar*, Erdem Bıyık*, Dorsa Sadigh, Ramtin Pedarsani. Learning How to Dynamically Route Autonomous Vehicles on Shared Roads. Transportation Research Part C: Emerging Technologies (TR-C), 2021. [PDF]
+
Woodrow Z. Wang*, Mark Beliaev*, Erdem Bıyık*, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh. Emergent Prosociality in Multi-Agent Games Through Gifting. 30th International Joint Conference on Artificial Intelligence (IJCAI) 2021. [PDF]
+
Minae Kwon, Siddharth Karamcheti, Mariano-Florentino Cuéllar, Dorsa Sadigh. Targeted Data Acquisition for Evolving Negotiation Agents. International Conference on Machine Learning (ICML), 2021. [PDF]
+
Erdem Bıyık, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh. The Green Choice: Learning and Influencing Human Decisions on Shared Roads. Proceedings of the 58th IEEE Conference on Decision and Control (CDC), December 2019. [PDF]
+ AI agents need to collaborate and interact with humans in many different settings, such as autonomous vehicles driving alongside humans, robots assisting humans in homes, and AI assistants learning and leveraging human preferences.
+ Humans, on the other hand, collaborate surprisingly well even in complex tasks, adapting to each other through repeated interactions.
+ Given humans' computational constraints (such as being boundedly rational, with access to limited memory or time), we believe the reason humans can easily interact with each other is that interactions, despite their apparent complexity, are inherently structured. What emerges from these repeated interactions is shared knowledge about the interaction history that enables them to trust each other.
+
+
+ Understanding repeated and long-term interaction of learning agents with humans introduces a set of theoretical and applied challenges for developing more effective AI agents that can coordinate, collaborate, or even positively influence humans. Specifically, we focus on two fundamental research directions: (1) developing representation learning algorithms that enable capturing the core of interaction for better coordination, collaboration, and influencing, and (2) effectively adapting to human partners over repeated interactions.
+
+
+
+
+
+
+
+
+
+
Game Theoretic Approaches for Formalizing Interaction
+
+
+ In our work [AURO 2018], we have focused on a game-theoretic and dynamical systems approach for modeling the interaction between humans and robots. Specifically, we have formalized the interaction between autonomous cars and human-driven cars as an underactuated dynamical system in order to go beyond simplistic models of other drivers on the road, e.g., models that treat human-driven cars as moving obstacles, and instead take a learning-based approach that incorporates expressive models of human actions and their responses to robots. We demonstrate that we can plan to influence human-driven cars when optimizing for better safety, efficiency, and coordination. We actively gather information about the driving style of other vehicles to discover their policies and influence them toward more desirable strategies.
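The snippet below sketches the core planning idea at a toy scale: the robot optimizes its own objective through a learned model of how the human responds to its action, rather than treating the human as a static obstacle. The specific response model and reward terms are made-up placeholders.

```python
# Toy Stackelberg-style interaction: the robot (autonomous car) picks a merge
# speed u_r; the human behind best-responds by braking according to a learned
# model; the robot plans through that anticipated response.
def human_response(u_r):
    """Learned human model: brake harder the more aggressively the robot merges."""
    return -0.5 * u_r

def robot_reward(u_r):
    u_h = human_response(u_r)   # anticipated human action
    progress = u_r              # the robot wants to make progress...
    disruption = u_h ** 2       # ...without forcing the human to brake hard
    return progress - 2.0 * disruption

# Plan over a grid of candidate merge speeds in [0, 2].
best_u = max((u / 100 for u in range(0, 201)), key=robot_reward)
print(best_u)  # 1.0: an interior optimum trading progress against disruption
```

Because the human's response enters the robot's objective, the optimizer settles on a moderate action: exactly the kind of influence-aware plan that a moving-obstacle model cannot produce.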
+
+
+
+
Representations for Repeated and Continual Interactions
+ We build upon the important insight that humans and robots need to coordinate with each other over long-term and repeated interactions, and that game-theoretic techniques to build models of the partner are not scalable over continual repeated interactions.
+ We have thus developed a new and orthogonal paradigm that learns a low-dimensional representation in Markov games---which we refer to as conventions---that captures the core of interaction. Conventions are approximations of the sufficient statistics needed for multi-agent coordination. This idea enables long-term and adaptive interactive behavior in a scalable fashion.
+
+
+
+
+
+
+
+ The learned low-dimensional representation can correspond to a diverse set of entities such as assignment of leading and following roles in multi-robot games [AURO 2021], the listening and speaking roles in dyadic interactions, e.g., when two robots collaboratively transport an object [CoRL 2019], a latent action space for teleoperating an assistive robot [ICRA 2020, RSS 2020, IROS 2020, AURO 2021, L4DC 2021, CoRL 2021], a latent strategy or intent of partner policies [CoRL 2020, ICLR 2021, CoRL 2021], or even conventions developed through linguistic communication [CoNLL 2020].
+
+
+
+
+ We demonstrate that we can train a deep reinforcement learning policy that leverages these learned representations to better model the non-stationary partner strategy, and further to plan and even influence the partner toward more effective long-term outcomes. Our algorithm LILI: Learning and Influencing Latent Intent can play a game of air hockey in real time with another partner (robot or human) without any prior knowledge of their strategy [CoRL 2020].
+
+
+Building upon this work, we have studied how to reduce non-stationarity in multi-agent reinforcement learning by stabilizing these learned representations. Our algorithm, SILI: Stable Influencing of Latent Intent, stabilizes partner strategies in a way that leads to natural role assignments and an easier learning problem in multi-agent collaboration [CoRL 2021]. In addition, we have studied how these conventions adapt over repeated interactions and have proposed a modular approach that separately learns the conventions and their evolution [ICLR 2021].
+
+
+
+
+
An Application of Learned Representations: Assistive Teleoperation
+ For almost one million American adults living with physical disabilities, picking up a bite of food or pouring a glass of water presents a significant challenge. Wheelchair-mounted robotic arms -- and other physically assistive devices -- hold the promise of increasing user autonomy, reducing reliance on caregivers, and improving quality of life. Unfortunately, the very dexterity that makes these robotic assistants useful also makes them hard for humans to control. Today's users must teleoperate their assistive robots throughout entire tasks. For instance, when users control an assistive robot for eating, they would need to carefully orchestrate the position and orientation of the end-effector to move a fork to the plate, spear a morsel of food, and then guide the food back towards their mouth. These challenges are often prohibitive: users living with disabilities have reported that they choose not to leverage their assistive robot when eating because of the associated difficulty. Our key insight is that controlling high-dimensional robots can become easier by learning and leveraging low-dimensional representations of actions, which enable users to convey their intentions, goals, and plans to the robot using simple, intuitive, and low-dimensional inputs.
+
+
+
+
+
+
+
+
+Imagine that you are working with the assistive robot to grab food from your plate. Here we placed three marshmallows on a table in front of the user, and the person needs to make the robot grab one of these marshmallows using their joystick.
+
+
+
+
+
+Importantly, the robot does not know which marshmallow the human wants! Ideally, the robot will make this task easier by learning a simple mapping between the person's inputs and their desired marshmallow.
+
+Incomplete List of Related Publications:
Woodrow Zhouyuan Wang, Andy Shih, Annie Xie, Dorsa Sadigh. Influencing Towards Stable Multi-Agent Interactions. Proceedings of the 5th Conference on Robot Learning (CoRL), 2021. [PDF]
+
Annie Xie, Dylan Losey, Ryan Tolsma, Chelsea Finn, Dorsa Sadigh. Learning Latent Representations to Influence Multi-Agent Interaction. Proceedings of the 4th Conference on Robot Learning (CoRL), 2020. [PDF]
+
Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh. On the Critical Role of Conventions in Adaptive Human-AI Collaboration. 9th International Conference on Learning Representations (ICLR), 2021. [PDF]
+
Siddharth Karamcheti*, Megha Srivastava*, Percy Liang, Dorsa Sadigh. LILA: Language-Informed Latent Actions. Proceedings of the 5th Conference on Robot Learning (CoRL), 2021. [PDF]
+
Dylan Losey, Hong Jun Jeon, Mengxi Li, Krishnan Srinivasan, Ajay Mandlekar, Animesh Garg, Jeannette Bohg, Dorsa Sadigh. Learning Latent Actions to Control Assistive Robots. Journal of Autonomous Robots (AURO), 2021. [PDF]
+
Hong Jun Jeon, Dylan Losey, Dorsa Sadigh. Shared Autonomy with Learned Latent Actions. Proceedings of Robotics: Science and Systems (RSS), July 2020. [PDF]
+
Mengxi Li*, Minae Kwon*, Dorsa Sadigh. Influencing Leading and Following in Human-Robot Teams. Journal of Autonomous Robots (AURO), 2021. [PDF]
+
Dorsa Sadigh, Nick Landolfi, S. Shankar Sastry, Sanjit A. Seshia, Anca D. Dragan. Planning for Cars that Coordinate with People: Leveraging Effects on Human Actions for Planning and Active Information Gathering over Human Internal State. Autonomous Robots (AURO), October 2018. [PDF]