Exposing Long-Tail Safety Failures in Large Language Models through Efficient Diverse Response Sampling
Reviewed on OpenReview: https://openreview.net/forum?id=tHfAskovWI