Linguistic Steganography via Self-Adjusting Asymmetric Number System
Keywords:
Linguistic steganography, automatic text generation, GPT-2, Asymmetric number system
Abstract
Linguistic steganography seeks to conceal secret information within natural language text. However, existing methods often struggle to balance stego text quality with embedding efficiency, largely due to limitations in generation strategies and coding mechanisms. We propose SA-ANS, a self-adaptive linguistic steganography framework based on a self-adjusting Asymmetric Numeral System. SA-ANS allows user-specified embedding rates and employs probabilistic coding with adaptive candidate selection, dynamically tailoring the token pool to the language model’s probability distribution. This design produces fluent, semantically coherent stego text while preserving statistical indistinguishability from natural language. Extensive experiments on multiple benchmark datasets, evaluated across embedding efficiency, linguistic quality, statistical similarity, robustness to steganalysis, and human judgment, show that SA-ANS consistently outperforms state-of-the-art methods, demonstrating both effectiveness and practicality.