Here are three papers describing different side-channel attacks against LLMs. “ Remote Timing Attacks on Efficient Language Model Inference “: Abstract: Scaling up language models has significantly increased their capabilities. But larger models are slower models, and so there is now an extensive body of work (e.g., speculative sampling or parallel decoding) that…