Publications

How Not to Detect Prompt Injections with an LLM

Published in AISec 2025 (18th ACM Workshop on Artificial Intelligence and Security), 2025

Recent defenses based on known-answer detection (KAD) have achieved near-perfect performance by using an LLM to classify inputs as clean or contaminated. In this work, we formally characterize the KAD framework and uncover a structural vulnerability in its design that invalidates its core security premise.

Preempt: Sanitizing Sensitive Prompts for LLMs

Published in NDSS 2026 (Network and Distributed System Security Symposium), 2025

We introduce a cryptographically inspired notion of a prompt sanitizer which transforms an input prompt to protect its sensitive tokens.

Functional Homotopy: Smoothing Discrete Optimization Via Continous Parameters for LLM Jailbreak Attacks

Published in ICLR 2025 (International Conference on Learning Representations), 2025

This study introduces the functional homotopy method, which leverages the functional duality between model training and input generation to improve jailbreak attacks against LLMs.

Prϵϵmpt: Sanitizing Sensitive Prompts for LLMs

Published in AAAI 2024 Workshop (Privacy-Preserving Artificial Intelligence), 2023

We introduce a cryptographically inspired notion of a prompt sanitizer which transforms an input prompt to protect its sensitive tokens.

Simulating Network Paths with Recurrent Buffering Units

Published in AAAI 2023 (Association for the Advancement of Artificial Intelligence), 2023

We developed an end-to-end network path simulation model that embeds the semantics of a physical network, leveraging domain-specific insights from physical network paths and explicitly modelling unobservable cross-traffic using a new RNN-style architecture called Recurrent Buffering Unit (RBU).

WaveTransform: Crafting Adversarial Examples via Input Decomposition

Published in ECCV 2020 Workshop (Adversarial Robustness in the Real World), 2020

We introduce a novel class of adversarial attacks, namely ‘WaveTransform’, that creates adversarial noise corresponding to low-frequency and high-frequency subbands, separately (or in combination).

A Hybrid Approach to Tiger Re-Identification

Published in ICCV 2019 Workshop (Computer Vision for Wildlife Conservation), 2019

We propose to utilize both deep learning and traditional SIFT descriptor-based matching for tiger re-identification.

Divyam Anshumaan