kristofer revised this gist.
1 file changed, 72 insertions
introtoLLMconcepts-limits.md (file created)
| @@ -0,0 +1,72 @@ | |||
| 1 | + | # A Short Personal Reference Guide to LLM Concepts & Limitations | |
| 2 | + | ||
| 3 | + | ## Core Concepts | |
| 4 | + | ||
| 5 | + | ### Foundation | |
| 6 | + | - **Large Language Models**: Neural networks trained on vast text data to predict next tokens | |
| 7 | + | - **Architecture**: Primarily transformer-based with attention mechanisms | |
| 8 | + | - **Parameters**: Learned weights; their count (billions to trillions) sets model size/capacity | |
| 9 | + | - **Training**: Self-supervised learning on text corpora | |
| 10 | + | - **Inference**: Generation of text responses to prompts | |
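
To make "trained on text to predict next tokens" concrete, here is a minimal sketch that uses a bigram counter as a stand-in. It is not a transformer and has no attention or learned parameters; the corpus and the function names (`train_bigram`, `predict_next`, `generate`) are made up for illustration. It only shows the shape of the loop: learn from raw text without labels, then generate one token at a time.

```python
from collections import Counter, defaultdict


def train_bigram(corpus: str) -> dict:
    """'Training': count which token follows each token in the raw text."""
    tokens = corpus.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def predict_next(follows: dict, token: str):
    """Return the most frequent follower of `token`, or None if unseen."""
    counts = follows.get(token.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def generate(follows: dict, start: str, max_tokens: int = 5) -> str:
    """'Inference': greedily append the most likely next token."""
    out = [start.lower()]
    for _ in range(max_tokens):
        nxt = predict_next(follows, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Toy corpus (an assumption for this sketch, not real training data).
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)
print(predict_next(model, "cat"))
print(generate(model, "the", max_tokens=3))
```

The labels come from the text itself (each token is the target for the one before it), which is the sense in which the training is self-supervised.
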
| 11 | + | ||
| 12 | + | ### Key Capabilities | |
| 13 | + | - Text comprehension and generation | |
| 14 | + | - Pattern recognition across diverse domains | |
| 15 | + | - In-context learning from examples | |
| 16 | + | - Reasoning through chain-of-thought processes | |
| 17 | + | - Knowledge retrieval from training data | |
| 18 | + | ||
| 19 | + | ### Interaction Methods | |
| 20 | + | - **Prompting**: Crafting effective instructions | |
| 21 | + | - **RAG**: Retrieval-Augmented Generation for factual grounding | |
| 22 | + | - **Fine-tuning**: Adapting models to specific tasks/domains | |
| 23 | + | - **Function calling**: Enabling tool use and API integration | |
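
As a rough illustration of the RAG item above, the sketch below retrieves the best-matching document and places it in the prompt. Word overlap stands in for the embedding search a real system would likely use, and `DOCS`, `retrieve`, and `build_prompt` are hypothetical names, not part of any library. No model is called; the output is just the grounded prompt string.

```python
def tokenize(text: str) -> set:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query: str, documents: list, k: int = 1) -> list:
    """Rank documents by word overlap with the query (stand-in for vector search)."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: list, k: int = 1) -> str:
    """Assemble a prompt that asks the model to answer from retrieved context only."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


# Hypothetical knowledge base for the example.
DOCS = [
    "The office closes at 6 pm on weekdays.",
    "Parking permits are renewed every January.",
    "The cafeteria serves lunch from 11 am to 2 pm.",
]

prompt = build_prompt("When does the office close?", DOCS)
print(prompt)
```
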
| 24 | + | ||
| 25 | + | ## Critical Limitations | |
| 26 | + | ||
| 27 | + | ### Knowledge Constraints | |
| 28 | + | - **Knowledge cutoff**: Limited to training data timeframe | |
| 29 | + | - **Hallucinations**: Confidently generating false information | |
| 30 | + | - **Shallow expertise**: Broad but often superficial domain knowledge | |
| 31 | + | ||
| 32 | + | ### Reasoning Gaps | |
| 33 | + | - **Mathematical reasoning**: Struggles with complex calculations | |
| 34 | + | - **Spatial reasoning**: Limited understanding of physical space/geometry | |
| 35 | + | - **Causal reasoning**: Tends to confuse correlation with causation | |
| 36 | + | - **Logical consistency**: Contradictions across longer contexts | |
| 37 | + | ||
| 38 | + | ### Context Handling | |
| 39 | + | - **Context window**: Only a limited number of tokens of text history is visible at once | |
| 40 | + | - **Memory**: No persistent memory between sessions | |
| 41 | + | - **World model**: Incomplete understanding of physical reality | |
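
The context-window limit can be sketched as a token budget applied to the conversation history. This example assumes a crude one-token-per-word count (real tokenizers split text differently) and a hypothetical `fit_to_window` helper that keeps only the most recent messages that fit.

```python
def count_tokens(text: str) -> int:
    """Crude proxy: one token per whitespace-separated word."""
    return len(text.split())


def fit_to_window(messages: list, max_tokens: int) -> list:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > max_tokens:
            break  # everything older than this point falls out of view
        kept.append(message)
        used += cost
    return list(reversed(kept))


history = [
    "My name is Ada and I work on compilers.",
    "What is a parse tree?",
    "A parse tree shows how a sentence is derived from a grammar.",
    "Can you give an example?",
]

visible = fit_to_window(history, max_tokens=20)
for message in visible:
    print(message)
```

With a budget of 20, only the last two messages survive, so the opening message that contains the user's name is no longer visible. The same mechanism is why earlier details can appear to be forgotten in long conversations.
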
| 42 | + | ||
| 43 | + | ### Social Limitations | |
| 44 | + | - **Theory of mind**: Limited understanding of human intentions/beliefs | |
| 45 | + | - **Cultural nuance**: Struggles with complex cultural contexts | |
| 46 | + | - **Ethical reasoning**: Simplified moral frameworks | |
| 47 | + | ||
| 48 | + | ### Technical Constraints | |
| 49 | + | - **Latency**: Generation speed limitations | |
| 50 | + | - **Computing costs**: Resource intensity of large models | |
| 51 | + | - **Error recovery**: Difficulty recognizing and correcting mistakes | |
| 52 | + | - **Multimodal integration**: Limited understanding across modalities | |
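
For a back-of-the-envelope feel for computing costs, the memory needed to hold the weights is roughly the parameter count times the bytes per parameter. The sketch below assumes a hypothetical 7-billion-parameter model and counts weights only; activations, caches, and optimizer state would add to the total.

```python
# Bytes used to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "fp16") -> float:
    """Memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# 7e9 is an illustrative model size, not a reference to a specific model.
for dtype in ("fp32", "fp16", "int8"):
    print(dtype, weight_memory_gb(7e9, dtype), "GB")
```
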
| 53 | + | ||
| 54 | + | ## Mitigation Strategies | |
| 55 | + | ||
| 56 | + | ### Enhancing Reliability | |
| 57 | + | - Tool integration for verified calculations/lookups | |
| 58 | + | - Chain-of-thought prompting for complex reasoning | |
| 59 | + | - Knowledge retrieval from external databases | |
| 60 | + | - Explicit verification steps for factual claims | |
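
One way to picture tool integration for verified calculations is to route arithmetic to deterministic code instead of trusting generated digits. The request shape `{"tool": ..., "input": ...}` and the `TOOLS` registry below are assumptions made for this sketch, not the format of any particular provider's function-calling API.

```python
import ast
import operator

# Only these arithmetic operators are allowed; everything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def calculate(expression: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# Hypothetical tool registry: tool name -> Python callable.
TOOLS = {"calculator": calculate}


def run_tool_call(call: dict):
    """Dispatch a tool request emitted by the model to the matching function."""
    tool = TOOLS.get(call["tool"])
    if tool is None:
        raise KeyError(f"unknown tool: {call['tool']}")
    return tool(call["input"])


result = run_tool_call({"tool": "calculator", "input": "1234 * 5678 + 9"})
print(result)
```

Walking the parsed syntax tree, rather than calling `eval`, keeps the tool restricted to arithmetic, so a malformed or malicious expression raises an error instead of executing.
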
| 61 | + | ||
| 62 | + | ### Improving Output | |
| 63 | + | - Clear instructions with specific format expectations | |
| 64 | + | - Example-based prompting for consistency | |
| 65 | + | - Breaking complex tasks into simpler components | |
| 66 | + | - Focused domain-specific prompting | |
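
The first two items above (a specific output format plus worked examples) can be sketched as a prompt builder paired with a strict parser for the reply. The sentiment task, the JSON shape, and the function names are illustrative assumptions; no model is called here, so the replies passed to the parser are hand-written stand-ins.

```python
import json


def build_few_shot_prompt(instruction: str, examples: list, query: str) -> str:
    """Combine an instruction, worked examples, and the new input into one prompt."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {json.dumps({'sentiment': label})}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def parse_reply(reply: str, allowed=("positive", "negative")):
    """Return the label if the reply matches the requested format, else None."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if isinstance(data, dict) and data.get("sentiment") in allowed:
        return data["sentiment"]
    return None


prompt = build_few_shot_prompt(
    'Classify the sentiment. Reply with JSON like {"sentiment": "positive"}.',
    [("I loved it.", "positive"), ("Never again.", "negative")],
    "The service was quick and friendly.",
)
print(prompt)
print(parse_reply('{"sentiment": "positive"}'))
print(parse_reply("It is positive!"))
```

Rejecting replies that do not match the format gives the caller a clear signal to retry or fall back, instead of passing malformed output downstream.
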
| 67 | + | ||
| 68 | + | ### Safety Practices | |
| 69 | + | - Avoid overreliance on LLM outputs for critical decisions | |
| 70 | + | - Verify factual claims through independent sources | |
| 71 | + | - Maintain human oversight for sensitive applications | |
| 72 | + | - Recognize appropriate use cases and limitations | |