Structuring Text as Input to Generative Artificial Intelligence
Prompt engineering. It’s the art of wrestling information out of a generative artificial intelligence model. Not a gentle persuasion, mind you. More like a carefully orchestrated interrogation. You structure, you craft, you cajole, all to get the output you want. It’s less about asking nicely and more about knowing what strings to pull.
A prompt, in its most basic form, is just natural language – a description, a command, a plea, whatever. For a text-to-text language model, it can be a sharp query, a detailed instruction, or even a whole conversation history, all designed to nudge the AI in the right direction. Prompt engineering is about the finesse, the choice of words, the grammar, the subtle nuances that make the difference between a useful response and a load of gibberish. [3] It’s about providing the context, the persona, the flavor the AI needs to embody. [1]
When you’re dealing with models that conjure images or sounds from thin air – text-to-image or text-to-audio – the prompt is your canvas. It’s a detailed sketch, like "a high-quality photo of an astronaut riding a horse," or a mood board, "Lo-fi slow BPM electro chill with organic samples." [5] You’re not just describing; you’re dictating. You’re tweaking words, adding, subtracting, emphasizing, all to nail the subject, the style, the lighting, the aesthetic. [6]
History
It wasn't always this… deliberate. Back in 2018, some bright sparks decided all those separate natural language processing tasks could be boiled down to a single, elegant question-answering problem. They even built a model, a unified beast, capable of answering questions about sentiment, translation, or who’s in charge. [7]
Then came the AI boom. Suddenly, everyone was scrambling to figure out how to prompt these things effectively, how to avoid the nonsensical outputs, the dreaded hallucinations. It became a process of relentless trial-and-error. [8] And when ChatGPT dropped in 2022, prompt engineering went from a niche curiosity to a supposed "essential business skill." Though, as always, the economic future of such things remains as stable as a house of cards in a hurricane. [1] What was surprising was seeing seasoned engineers, the ones who should know better, diving headfirst into optimizing their prompts. This led to a flurry of platforms and companies offering training. [13]
By February 2022, someone had already cataloged over 2,000 public prompts across about 170 datasets. [9] Then, in 2022, Google researchers introduced "chain-of-thought" prompting. [10] [11] And in 2023, the floodgates opened, with databases of text-to-text and text-to-image prompts becoming readily available. [12] [13] Even a categorized dataset of generated image-text pairs, the Personalized Image-Prompt (PIP) dataset, surfaced in 2024. [14]
Text-to-Text
There are more ways to engineer a prompt than there are ways to fail at simple arithmetic. Let's just say that.
Chain-of-Thought
This is where things get interesting. Google Research proposed chain-of-thought (CoT) prompting. It's a technique that forces large language models (LLMs) to break down a problem, to show their work, before spitting out an answer. It’s like making them think aloud. [10] The idea is to improve their reasoning by mimicking a human's mental process, a natural train of thought. [15] This is particularly useful for multi-step problems – the kind that involve arithmetic or basic commonsense reasoning. [16] [17]
Imagine this: "Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought 6 more, how many apples do they have?"
A standard LLM might just give you a number. A CoT-prompted one? It'll go: "A: The cafeteria started with 23 apples. They used 20, leaving 23 - 20 = 3. Then they bought 6 more, so 3 + 6 = 9. The final answer is 9." [10]
When applied to PaLM, a colossal 540 billion parameter language model, Google claims CoT prompting made a significant difference. It allowed PaLM to compete with models specifically trained for certain tasks, even achieving state-of-the-art on the GSM8K mathematical reasoning benchmark at the time. [10] You can even fine-tune models on CoT data to boost this capability and, perhaps, make them a bit more transparent. [18] [19]
Originally, Google's CoT prompts came with examples – "exemplars" – to show the model what was expected. This made it a "few-shot" technique. But later, researchers found that simply tacking on "Let's think step by step" [20] was often enough. This turned it into a zero-shot technique, which is, frankly, more efficient.
Here’s the structure for few-shot CoT: [21]
Q: {example question 1}
A: {example answer 1}
...
Q: {example question n}
A: {example answer n}
Q: {your actual question}
A: {LLM's step-by-step response}
And for the lazy, zero-shot: [20]
Q: {your actual question}. Let's think step by step.
A: {LLM's response}
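Both templates are just string assembly. A minimal sketch (the exemplar content and the shop question are illustrative; in practice the assembled prompt is sent to a completion API):

```python
def few_shot_cot_prompt(exemplars, question):
    """Assemble the few-shot template: worked Q/A pairs, then the real question."""
    parts = [f"Q: {q}\nA: {a}" for q, a in exemplars]
    parts.append(f"Q: {question}\nA:")   # left open for the model's step-by-step answer
    return "\n\n".join(parts)

def zero_shot_cot_prompt(question):
    """Zero-shot variant: just append the trigger phrase."""
    return f"Q: {question}. Let's think step by step.\nA:"

# Illustrative exemplar, reusing the cafeteria problem from above.
exemplar = ("The cafeteria had 23 apples. If they used 20 to make lunch and bought "
            "6 more, how many apples do they have?",
            "The cafeteria started with 23 apples. They used 20, leaving 23 - 20 = 3. "
            "Then they bought 6 more, so 3 + 6 = 9. The final answer is 9.")
prompt = few_shot_cot_prompt([exemplar], "A shop had 10 pens and sold 4. How many remain?")
```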
In-Context Learning
This refers to the model's ability to learn temporarily from the prompt. It's like giving it a quick cram session. You might throw in a few examples, like "maison → house, chat → cat, chien →" (expecting "dog"), and it just… gets it. This is called few-shot learning. [22] [23]
It’s an emergent ability of LLMs, meaning it pops up unexpectedly as the models get bigger. [24] It’s not a permanent change, unlike actual training or fine-tuning. It’s fleeting. [26] Teaching models to do this is essentially meta-learning – "learning to learn." [27]
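A few-shot prompt like the translation example is just formatted exemplars with the last output left blank (a sketch; the model would be expected to complete the final line with "dog"):

```python
def few_shot_prompt(pairs, query, sep=" → "):
    """Format input→output exemplars so the model can infer the mapping in context."""
    lines = [f"{x}{sep}{y}" for x, y in pairs]
    lines.append(f"{query}{sep}")      # left incomplete for the model to fill in
    return ", ".join(lines)

p = few_shot_prompt([("maison", "house"), ("chat", "cat")], "chien")
# p == "maison → house, chat → cat, chien → "
```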
Self-Consistency
This technique runs multiple chain-of-thought processes. Then, it picks the answer that comes up most often. It’s like polling a jury of its own thoughts. [28] [29]
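The vote itself is trivial. In this sketch the sampled answers are hard-coded stand-ins for the final answers of repeated CoT runs at nonzero temperature:

```python
from collections import Counter

def self_consistency(answers):
    """Majority vote over the final answers of several chain-of-thought samples."""
    return Counter(answers).most_common(1)[0][0]

# Five hypothetical CoT runs of the cafeteria problem, sampled at temperature > 0:
samples = ["9", "9", "3", "9", "29"]
assert self_consistency(samples) == "9"   # the most frequent answer wins
```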
Tree-of-Thought
Imagine CoT, but instead of a single line of reasoning, it branches out. It generates multiple paths, can backtrack, and explore different options. Think of it as using tree search algorithms like breadth-first or depth-first search. [29] [30]
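A toy sketch of the breadth-first variant. The `expand` and `score` functions would normally be LLM calls that propose and rate partial thoughts; here a numeric puzzle stands in for them:

```python
def tree_of_thought_bfs(root, expand, score, is_solution, beam=3, max_depth=4):
    """Breadth-first search over partial 'thoughts': expand every frontier state,
    stop if a child solves the problem, otherwise keep only the top-`beam` children."""
    frontier = [root]
    for _ in range(max_depth):
        children = [c for state in frontier for c in expand(state)]
        solved = [c for c in children if is_solution(c)]
        if solved:
            return solved[0]
        frontier = sorted(children, key=score, reverse=True)[:beam]
    return None

# Toy puzzle: reach exactly 10 through a sequence of +1/+2/+3 "thoughts".
expand = lambda s: [s + step for step in (1, 2, 3)]
score = lambda s: -abs(10 - s)            # states closer to 10 look more promising
found = tree_of_thought_bfs(0, expand, score, lambda s: s == 10)
# found == 10
```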
Prompting to Estimate Model Sensitivity
These models are sensitive. Like a diva on opening night. Small changes in prompt wording, structure, even grammar, can swing accuracy by as much as 76 points. [31] Linguistic features matter – morphology, syntax, the very choice of words. [3] [32] Even something as basic as clausal syntax can tighten up their responses. [33] And this fragility doesn't disappear with bigger models or more examples.
To combat this, researchers are developing methods to make prompts more robust. FormatSpread, for instance, tests a range of formats to map out the performance landscape. [31] PromptEval does something similar, estimating performance distributions to find reliable metrics. [34]
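In the spirit of FormatSpread (an illustrative sketch, not the paper's implementation), one can enumerate semantically equivalent renderings of the same question and measure the gap between the best- and worst-performing format:

```python
from itertools import product

def format_variants(question):
    """Enumerate trivially different renderings of the same question."""
    prefixes = ["Q: ", "Question: ", "QUESTION:\n"]
    suffixes = ["\nA:", "\nAnswer:", " Answer: "]
    return [p + question + s for p, s in product(prefixes, suffixes)]

def accuracy_spread(accuracies):
    """The sensitivity metric of interest: best-case minus worst-case accuracy."""
    return max(accuracies) - min(accuracies)

variants = format_variants("What is 2 + 2?")
# 9 variants; evaluating each on a test set maps out the performance landscape
```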
Automatic Prompt Generation
Retrieval-Augmented Generation
• Main article: Retrieval-augmented generation
This is about giving the AI access to fresh information. It’s a way to augment its responses by pulling from specific documents or databases before it even starts generating. This is crucial for domain-specific knowledge or just keeping things current. [35]
RAG essentially injects an information retrieval step before generation. Instead of relying solely on its pre-trained, potentially stale, knowledge, the AI can consult external sources. This helps curb those infuriating AI hallucinations – the made-up policies, the non-existent legal cases. [36] It’s about grounding the AI in facts, without needing constant retraining.
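The injected retrieval step can be sketched in a few lines. Here a naive word-overlap ranking stands in for the embedding-based vector search a real system would use, and the documents are invented examples:

```python
def retrieve(query, documents, k=2):
    """Naive lexical retrieval: rank documents by word overlap with the query."""
    q = set(query.lower().split())
    ranked = sorted(documents, key=lambda d: len(q & set(d.lower().split())), reverse=True)
    return ranked[:k]

def rag_prompt(query, documents):
    """Prepend retrieved context so the model answers from sources, not just weights."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return f"Answer using only the context below.\nContext:\n{context}\n\nQuestion: {query}"

docs = ["The return policy allows refunds within 30 days.",
        "Shipping is free on orders over 50 dollars.",
        "Our office cat is named Pixel."]
p = rag_prompt("What is the return policy for refunds?", docs)
```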
Graph Retrieval-Augmented Generation
This is RAG, but with a twist. GraphRAG uses knowledge graphs to connect disparate pieces of information, helping the model synthesize insights across vast datasets. [37] [38] It can handle unstructured, structured, and mixed data, offering a more holistic understanding.
There’s prior work on using knowledge graphs for question answering, linking text to queries. [39] Combining these allows for searches across different data types, enriching the context and improving relevance.
Context Engineering
This isn't about the prompt itself, but everything around it. System instructions, retrieved knowledge, tool definitions, conversation history, metadata – all carefully curated and governed. It’s about making LLM systems more reliable, traceable, and efficient with tokens. [40] [41]
It involves operational rigor: managing token budgets, tracking provenance, versioning context artifacts, logging what context was used, and setting up regression tests. A 2025 survey formalized this, breaking it down into context retrieval/generation, processing, and management. The idea is to treat the context window as a managed engineering surface, not just a passive dumping ground for documents. [42]
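Token-budget management, one slice of that rigor, might look like this sketch (whitespace word counts stand in for a real tokenizer, and the snippets are invented):

```python
def fit_to_budget(system, snippets, budget, count=lambda s: len(s.split())):
    """Greedily pack (priority, text) context snippets under a token budget,
    highest priority first, always reserving room for the system instructions."""
    remaining = budget - count(system)
    kept = []
    for priority, text in sorted(snippets, reverse=True):
        cost = count(text)
        if cost <= remaining:
            kept.append(text)
            remaining -= cost
    return "\n\n".join([system] + kept)

ctx = fit_to_budget(
    "You are a support agent.",
    [(2, "Refunds are allowed within 30 days of purchase."),
     (1, "Our founding story, told at great length, " + "word " * 40)],
    budget=30,
)
# the high-priority refund policy fits; the long low-priority blurb is dropped
```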
Using Language Models to Generate Prompts
Why use a human to write prompts when another AI can do it? LLMs can generate prompts for other LLMs. [43] The "automatic prompt engineer" algorithm uses one LLM to beam search for the best prompts for another. [44] [45]
Here’s the gist:
- You have two LLMs: the target and the prompter.
- The prompter gets examples of input-output pairs and is asked to create instructions that would yield those outputs from the inputs.
- These generated instructions are then used to prompt the target LLM. The log-probabilities of the outputs are calculated – that’s the instruction’s score.
- The highest-scoring instructions are fed back to the prompter for refinement.
- Repeat until you’re satisfied.
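The loop above, as a skeleton. The two LLMs are replaced by deterministic stand-ins: "instructions" are numbers, the target secretly rewards values near 7, and the canned proposal list keeps the demo reproducible (a real `score` would be the summed log-probabilities of the desired outputs):

```python
def automatic_prompt_search(propose, score, rounds=3, pool=4):
    """Skeleton of the automatic-prompt-engineer loop: a prompter proposes
    candidate instructions, the target scores them, and the best candidate
    seeds the next round of proposals."""
    candidates = [propose(None) for _ in range(pool)]
    for _ in range(rounds):
        best = max(candidates, key=score)
        candidates = [best] + [propose(best) for _ in range(pool - 1)]
    return max(candidates, key=score)

# Deterministic stand-ins for the two models.
proposals = iter([2.0, 5.0, 1.0, 6.5, 6.0, 7.2, 6.9, 7.05, 7.0, 6.8, 7.1, 6.95, 7.01])
propose = lambda seed: next(proposals)
score = lambda inst: -abs(7 - inst)       # stands in for summed log-probabilities
best = automatic_prompt_search(propose, score)
# best == 7.0
```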
LLMs can even generate CoT examples themselves. The "auto-CoT" method clusters questions, picks representative ones, and then uses an LLM to generate zero-shot CoT answers for them. These question-answer pairs become demonstrations for few-shot learning. [46]
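A sketch of the auto-CoT pipeline, with crude length-based grouping standing in for the paper's embedding-based clustering and a canned response standing in for the zero-shot CoT call:

```python
def auto_cot_demos(questions, zero_shot_cot, n_clusters=2):
    """Auto-CoT sketch: partition questions into crude clusters (by length here,
    standing in for embedding-based k-means), take a central representative per
    cluster, and answer it zero-shot to form a few-shot demonstration."""
    qs = sorted(questions, key=len)
    size = max(1, len(qs) // n_clusters)
    clusters = [qs[i:i + size] for i in range(0, len(qs), size)][:n_clusters]
    reps = [c[len(c) // 2] for c in clusters]     # central member of each cluster
    return [(q, zero_shot_cot(q)) for q in reps]

# Hypothetical stand-in for an LLM called with "Let's think step by step."
fake_cot = lambda q: "Let's think step by step. ... The answer is 42."
demos = auto_cot_demos(["short q?", "a somewhat longer question?", "tiny?",
                        "the very longest question of them all?"], fake_cot)
```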
Automatic Prompt Optimization
These techniques refine prompts automatically, using held-out data and metrics. MIPRO jointly optimizes instructions and few-shot demonstrations for multi-stage programs, [47] while GEPA evolves prompts through natural-language reflection. [48] Frameworks like DSPy [49] and Opik [50] offer open-source implementations.
Text-to-Image
• See also: Artificial intelligence visual art § Prompt engineering and sharing
2022 was the year models like DALL-E 2, Stable Diffusion, and Midjourney hit the mainstream. [51] [6] They take your words and spin them into images.
(Image demonstrating the effect of negative prompts on images generated with Stable Diffusion)
- Top: No negative prompt.
- Center: "green trees" (as a negative prompt).
- Bottom: "round stones, round rocks" (as negative prompts).
Prompt Formats
These image models aren't quite as sophisticated with language as LLMs. Negation, grammar, sentence structure – they can be tricky. A prompt like "a party with no cake" might still yield a cake. [52] The workaround? Negative prompts. You tell it what not to include in a separate input. [53] Some methods even automate generating these negative prompts by framing the main prompt as a sequence-to-sequence problem. [54]
A typical text-to-image prompt includes the subject, desired medium (digital painting, photography), style (hyperrealistic, pop-art), lighting (rim lighting, crepuscular rays), color, and texture. [55] Word order matters, too. What comes first often gets more emphasis. [56]
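Assembling such a prompt is mechanical, subject deliberately placed first; a minimal sketch with the field names taken from the list above:

```python
def image_prompt(subject, medium=None, style=None, lighting=None, color=None, texture=None):
    """Assemble a text-to-image prompt, subject first (earlier words get more weight)."""
    parts = [subject] + [p for p in (medium, style, lighting, color, texture) if p]
    return ", ".join(parts)

p = image_prompt("an astronaut riding a horse", medium="photography",
                 style="hyperrealistic", lighting="rim lighting")
# p == "an astronaut riding a horse, photography, hyperrealistic, rim lighting"
```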
Midjourney suggests keeping it concise. Instead of a verbose description, try something like: "Bright orange California poppies drawn with colored pencils." [52]
Artist Styles
You can even ask these models to mimic specific artists. "In the style of Greg Rutkowski" is a common one for Stable Diffusion and Midjourney. [57] Artists like Vincent van Gogh and Salvador Dalí are also popular for stylistic exploration. [58]
Non-Text Prompts
Some techniques go beyond just text.
Textual Inversion and Embeddings
This process creates new "word embeddings" based on example images. These embeddings act like unique keywords, allowing you to inject specific concepts or styles into your prompts. [59]
Image Prompting
In 2023, Meta released Segment Anything, a computer vision model that uses prompts other than text. It can accept bounding boxes, masks, or points to perform image segmentation. [60]
Using Gradient Descent to Search for Prompts
This is where it gets mathematical. "Prefix-tuning," "prompt tuning," or "soft prompting" involve using gradient descent to optimize vectors of floating-point numbers.
Formally, let $\mathbf{E} = \{\mathbf{e}_1, \dots, \mathbf{e}_k\}$ be a set of soft prompt tokens (tunable embeddings), and let $\mathbf{X} = \{\mathbf{x}_1, \dots, \mathbf{x}_m\}$ and $\mathbf{Y} = \{\mathbf{y}_1, \dots, \mathbf{y}_n\}$ be the token embeddings of the input and output. During training, the sequence $\text{concat}(\mathbf{E}; \mathbf{X}; \mathbf{Y})$ is fed to the LLM. Losses are calculated over the $\mathbf{Y}$ tokens, and gradients are backpropagated to the prompt-specific parameters. In prefix-tuning, these are parameters associated with the prompt tokens at each layer; in prompt tuning, they're soft tokens added to the vocabulary. [citation needed]
More formally, this is prompt tuning. Write the LLM as $\mathrm{LLM}(X) = F(E(X))$, where $X$ is a sequence of linguistic tokens, $E$ is the token-to-vector function, and $F$ is the rest of the model. In prompt tuning, you provide input-output pairs $\{(X^i, Y^i)\}_i$ and optimize

$$\arg\max_{\tilde{Z}} \sum_i \log \Pr\left[Y^i \mid \tilde{Z} \ast E(X^i)\right].$$

That is, you maximize the log-likelihood of outputting $Y^i$ when the model first encodes $X^i$ into $E(X^i)$, prepends the vector $\tilde{Z}$, and then applies $F$. Prefix tuning is similar, but the "prefix vector" $\tilde{Z}$ is prepended to the hidden states in every layer of the model.
An earlier approach used gradient descent but searched over token sequences rather than numerical vectors, specifically for masked language models like BERT. It searches for $\arg\max_{\tilde{X}} \sum_i \log \Pr[Y^i \mid \tilde{X} \ast X^i]$, where $\tilde{X}$ ranges over token sequences of a set length. [64]
Limitations
Prompt engineering is a bit like engineering itself – you learn principles, but they're often specific to the model you're working with. They're also volatile; a minor tweak can change everything. [65] [66] By 2025, prompt engineer – the "hottest job" of 2023 – was reportedly becoming obsolete, as models got better at intuiting intent and companies started training their own people. [67]
Prompt Injection
• Main article: Prompt injection • See also: SQL injection, Cross-site scripting, and Social engineering (security)
This is a cybersecurity nightmare. Prompt injection is when an attacker crafts inputs that look harmless but are designed to make the AI misbehave. [68] [69] It exploits the AI's inability to distinguish between its intended instructions and user input, allowing it to bypass safeguards and execute malicious commands. It’s like tricking a guard into letting an intruder in by disguising the intruder as a delivery person.
References
- ^ a b c Dina Genkina, "AI Prompt Engineering is Dead: Long live AI prompt engineering". IEEE Spectrum. March 6, 2024.
- ^ a b c d e f Alec Radford, Jeffrey Wu, Rewon Child, David Luan, Dario Amodei, Ilya Sutskever, "Language Models are Unsupervised Multitask Learners" (PDF). OpenAI. 2019.
- ^ a b Jan Philip Wahle, Terry Ruas, Yang Xu, Bela Gipp, "Paraphrase Types Elicit Prompt Engineering Capabilities". Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics. 2024. arXiv:2406.19898. doi:10.18653/v1/2024.emnlp-main.617.
- ^ Will Douglas Heaven, "This horse-riding astronaut is a milestone on AI's long road towards understanding". MIT Technology Review. April 6, 2022.
- ^ Kyle Wiggers, "Meta open sources an AI-powered music generator". TechCrunch. June 12, 2023.
- ^ a b Aayush Mittal, "Mastering AI Art: A Concise Guide to Midjourney and Prompt Engineering". Unite.AI. July 27, 2023.
- ^ Bryan McCann, Nitish Keskar, Caiming Xiong, Richard Socher, "The Natural Language Decathlon: Multitask Learning as Question Answering". ICLR. 2018. arXiv:1806.08730.
- ^ Nils Knoth, Antonia Tolzin, Andreas Janson, Jan Marco Leimeister, "AI literacy and its implications for prompt engineering strategies". Computers and Education: Artificial Intelligence. 2024. doi:10.1016/j.caeai.2024.100225. ISSN 2666-920X.
- ^ PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts. Association for Computational Linguistics. 2022.
- ^ a b c d e f Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed H. Chi, Quoc V. Le, Denny Zhou, "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". Advances in Neural Information Processing Systems (NeurIPS 2022). Vol. 35. arXiv:2201.11903.
- ^ Ben Brubaker, "How Chain-of-Thought Reasoning Helps Neural Networks Compute". Quanta Magazine. March 21, 2024.
- ^ Brian X. Chen, "How to Turn Your Chatbot Into a Life Coach". The New York Times. June 23, 2023.
- ^ a b Brian X. Chen, "Get the Best From ChatGPT With These Golden Prompts". The New York Times. May 25, 2023. ISSN 0362-4331.
- ^ a b Zijie Chen, Lichao Zhang, Fangsheng Weng, Lili Pan, Zhenzhong Lan, "Tailored Visions: Enhancing Text-to-Image Generation with Personalized Prompt Rewriting". 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. 2024. arXiv:2310.08129. doi:10.1109/cvpr52733.2024.00738. ISBN 979-8-3503-5300-6.
- ^ Narang, Sharan; Chowdhery, Aakanksha (April 4, 2022). "Pathways Language Model (PaLM): Scaling to 540 Billion Parameters for Breakthrough Performance". ai.googleblog.com.
- ^ Ekta Dang, "Harnessing the power of GPT-3 in scientific research". VentureBeat. February 8, 2023.
- ^ Roger Montti, "Google's Chain of Thought Prompting Can Boost Today's Best Algorithms". Search Engine Journal. May 13, 2022.
- ^ "Scaling Instruction-Finetuned Language Models". Journal of Machine Learning Research. 2024.
- ^ a b Jason Wei, Yi Tay, "Better Language Models Without Massive Compute". ai.googleblog.com. November 29, 2022.
- ^ a b Takeshi Kojima, Shixiang Shane Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa, "Large Language Models are Zero-Shot Reasoners". NeurIPS. 2022. arXiv:2205.11916.
- ^ Jason Wei, Xuezhi Wang, Dale Schuurmans, et al., "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". NeurIPS 2022. arXiv:2201.11903.
- ^ a b Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant, "What Can Transformers Learn In-Context? A Case Study of Simple Function Classes". NeurIPS. 2022. arXiv:2208.01066.
- ^ a b Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, "Language models are few-shot learners". Advances in Neural Information Processing Systems. 2020. arXiv:2005.14165.
- ^ a b Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebastian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, Ed H. Chi, Tatsunori Hashimoto, Oriol Vinyals, Percy Liang, Jeff Dean, William Fedus, "Emergent Abilities of Large Language Models". Transactions on Machine Learning Research. October 2022. arXiv:2206.07682.
- ^ Ethan Caballero, Kshitij Gupta, Irina Rish, David Krueger, "Broken Neural Scaling Laws". ICLR. 2023. arXiv:2210.14891.
- ^ George Musser, "How AI Knows Things No One Told It". Scientific American.
- ^ a b Shivam Garg, Dimitris Tsipras, Percy Liang, Gregory Valiant, "What Can Transformers Learn In-Context? A Case Study of Simple Function Classes". NeurIPS. 2022. arXiv:2208.01066.
- ^ Self-Consistency Improves Chain of Thought Reasoning in Language Models. ICLR. 2023. arXiv:2203.11171.
- ^ a b Aayush Mittal, "Latest Modern Advances in Prompt Engineering: A Comprehensive Guide". Unite.AI. May 27, 2024.
- ^ Tree of Thoughts: Deliberate Problem Solving with Large Language Models. NeurIPS. 2023. arXiv:2305.10601.
- ^ a b Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting. ICLR. 2024. arXiv:2310.11324.
- ^ Alina Leidinger, Robert van Rooij, Ekaterina Shutova, "The language of prompting: What linguistic properties make a prompt successful?". Findings of the Association for Computational Linguistics: EMNLP 2023. Association for Computational Linguistics. 2023. arXiv:2311.01967. doi:10.18653/v1/2023.findings-emnlp.618.
- ^ Stephan Linzbach, Dimitar Dimitrov, Laura Kallmeyer, Kilian Evang, Hajira Jabeen, Stefan Dietze, "Dissecting Paraphrases: The Impact of Prompt Syntax and supplementary Information on Knowledge Retrieval from Pretrained Language Models". Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers). Association for Computational Linguistics. 2024. arXiv:2404.01992. doi:10.18653/v1/2024.naacl-long.201.
- ^ Efficient multi-prompt evaluation of LLMs. NeurIPS. 2024. arXiv:2405.17202.
- ^ "Why Google's AI Overviews gets things wrong". MIT Technology Review. May 31, 2024.
- ^ "Can a technology called RAG keep AI models from making stuff up?". Ars Technica. June 6, 2024.
- ^ Jonathan Larson, Steven Truitt, "GraphRAG: Unlocking LLM discovery on narrative private data". Microsoft. February 13, 2024.
- ^ "An Introduction to Graph RAG". KDnuggets.
- ^ Juan Sequeda, Dean Allemang, Bryon Jacob, "A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases". Grades-Nda. 2023. arXiv:2311.07509.
- ^ a b Matt M. Casey, "Context Engineering: The Discipline Behind Reliable LLM Applications & Agents". Comet. November 5, 2025.
- ^ "Context Engineering". LangChain. July 2, 2025.
- ^ a b Lingrui Mei, "A Survey of Context Engineering for Large Language Models". arXiv. July 17, 2025.
- ^ Explaining Patterns in Data with Language Models via Interpretable Autoprompting (PDF). BlackboxNLP Workshop. 2023. arXiv:2210.01848.
- ^ Large Language Models are Human-Level Prompt Engineers. ICLR. 2023. arXiv:2211.01910.
- ^ Reid Pryzant, Dan Iter, Jerry Li, Yin Tat Lee, Chenguang Zhu, Michael Zeng, "Automatic Prompt Optimization with "Gradient Descent" and Beam Search". Conference on Empirical Methods in Natural Language Processing. 2023. arXiv:2305.03495. doi:10.18653/v1/2023.emnlp-main.494.
- ^ Automatic Chain of Thought Prompting in Large Language Models. ICLR. 2023. arXiv:2210.03493.
- ^ Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs. ACL. 2024. arXiv:2406.11695.
- ^ GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning. 2025. arXiv:2507.19457.
- ^ DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines. NeurIPS. 2023. arXiv:2310.03714.
- ^ "Introducing Opik: Prompt Optimization by Evaluation". comet.com. 2024.
- ^ Sharon Goldman, "Two years after DALL-E debut, its inventor is "surprised" by impact". VentureBeat. January 5, 2023.
- ^ a b "Prompts". docs.midjourney.com.
- ^ "Why Does This Horrifying Woman Keep Appearing in AI-Generated Images?". VICE. September 7, 2022.
- ^ Goldblum, R.; Pillarisetty, R.; Dauphinee, M. J.; Talal, N. (1975). "Acceleration of autoimmunity in NZB/NZW F1 mice by graft-versus-host disease". Clinical and Experimental Immunology. 19 (2): 377–385. ISSN 0009-9104. PMC 1538084. PMID 2403.
- ^ "Stable Diffusion prompt: a definitive guide". May 14, 2023.
- ^ a b Mohamad Diab, Julian Herrera, Bob Chernow, "Stable Diffusion Prompt Book" (PDF). August 7, 2023.
- ^ Melissa Heikkilä, "This Artist Is Dominating AI-Generated Art and He's Not Happy About It". MIT Technology Review. September 16, 2022.
- ^ Tessa Solomon, "The AI-Powered Ask Dalí and Hello Vincent Installations Raise Uncomfortable Questions about Ventriloquizing the Dead". ARTnews.com. August 28, 2024.
- ^ a b Rinon Gal, Yuval Alaluf, Yuval Atzmon, Or Patashnik, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or, "An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion". ICLR. 2023. arXiv:2208.01618.
- ^ Segment Anything (PDF). ICCV. 2023.
- ^ a b Xiang Lisa Li, Percy Liang, "Prefix-Tuning: Optimizing Continuous Prompts for Generation". Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers). 2021. doi:10.18653/V1/2021.ACL-LONG.353. S2CID 230433941.
- ^ a b Brian Lester, Rami Al-Rfou, Noah Constant, "The Power of Scale for Parameter-Efficient Prompt Tuning". Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 2021. arXiv:2104.08691. doi:10.18653/V1/2021.EMNLP-MAIN.243. S2CID 233296808.
- ^ How Does In-Context Learning Help Prompt Tuning?. EACL. 2024. arXiv:2302.11521.
- ^ Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, Sameer Singh, "AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts". Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). Association for Computational Linguistics. 2020. doi:10.18653/v1/2020.emnlp-main.346. S2CID 226222232.
- ^ Lennart Meincke, Ethan R. Mollick, Lilach Mollick, Dan Shapiro, "Prompting Science Report 1: Prompt Engineering is Complicated and Contingent". March 04, 2025.
- ^ "'AI is already eating its own': Prompt engineering is quickly going extinct". Fast Company. May 6, 2025.
- ^ Isabelle Bousquette, "The Hottest AI Job of 2023 Is Already Obsolete". Wall Street Journal. April 25, 2025. ISSN 0099-9660.
- ^ a b Brandon Vigliarolo, "GPT-3 'prompt injection' attack causes bot bad manners". The Register. September 19, 2022.
- ^ "What is a prompt injection attack?". IBM. March 26, 2024.