A Short Personal Reference Guide to LLM Concepts & Limitations

Core Concepts

Foundation

  • Large Language Models: Neural networks trained on vast amounts of text to predict the next token
  • Architecture: Primarily transformer-based with attention mechanisms
  • Parameters: Model size/capacity, measured in weights (billions to trillions)
  • Training: Self-supervised learning on text corpora
  • Inference: Generation of text responses to prompts
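
The next-token objective above can be illustrated with a toy bigram model. This is a drastic simplification of a transformer (counts instead of learned attention), but the predict-one-token-at-a-time loop has the same shape; the corpus here is made up for illustration:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count bigram frequencies: for each word, how often each next word follows it.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation, mimicking greedy decoding."""
    counts = bigrams.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" twice, "mat" only once
```

A real model replaces the count table with billions of learned parameters and samples from a probability distribution rather than always taking the argmax.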

Key Capabilities

  • Text comprehension and generation
  • Pattern recognition across diverse domains
  • In-context learning from examples
  • Reasoning through chain-of-thought processes
  • Knowledge retrieval from training data
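
In-context learning usually means showing the model labeled examples inside the prompt itself. A minimal sketch of assembling such a few-shot prompt (the review texts and the sentiment task are hypothetical; any LLM client would receive the resulting string as input):

```python
# Hypothetical few-shot examples for a sentiment task.
examples = [
    ("great movie, loved it", "positive"),
    ("terrible pacing, boring", "negative"),
]

def build_few_shot_prompt(query):
    """Assemble labeled examples followed by the new, unlabeled input."""
    lines = [f"Review: {text}\nSentiment: {label}" for text, label in examples]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_few_shot_prompt("what a fantastic cast"))
```

The trailing `Sentiment:` cue is what steers the model to continue in the pattern the examples establish, without any fine-tuning.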

Interaction Methods

  • Prompting: Crafting effective instructions
  • RAG: Retrieval-Augmented Generation for factual grounding
  • Fine-tuning: Adapting models to specific tasks/domains
  • Function calling: Enabling tool use and API integration
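
The RAG pattern above can be sketched in a few lines: retrieve the most relevant documents, then prepend them to the prompt as grounding context. Word-overlap scoring stands in for the embedding search a real system would use, and the documents are invented for illustration:

```python
documents = [
    "The Eiffel Tower is 330 metres tall.",
    "Python 3.12 was released in October 2023.",
    "The Great Wall of China is thousands of kilometres long.",
]

def retrieve(query, docs, k=1):
    """Rank documents by word overlap with the query (a crude stand-in
    for embedding-based similarity search)."""
    q = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_rag_prompt(query):
    """Prepend the retrieved context so the model answers from it."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_rag_prompt("How tall is the Eiffel Tower?"))
```

Grounding the answer in retrieved text is what makes RAG effective against both knowledge cutoffs and hallucination.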

Critical Limitations

Knowledge Constraints

  • Knowledge cutoff: Knowledge ends at the training data's cutoff date; later events are unknown
  • Hallucinations: Confidently generating false information
  • Shallow expertise: Broad but often superficial domain knowledge

Reasoning Gaps

  • Mathematical reasoning: Struggles with complex calculations
  • Spatial reasoning: Limited understanding of physical space/geometry
  • Causal reasoning: Prone to confusing correlation with causation
  • Logical consistency: Contradictions across longer contexts

Context Handling

  • Context window: Only a fixed number of tokens of history can be attended to at once
  • Memory: No persistent memory between sessions
  • World model: Incomplete understanding of physical reality
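
The context-window limit is why chat applications truncate old turns. A minimal sketch of that truncation, keeping the most recent messages that fit a token budget (whitespace splitting is a crude stand-in for a real tokenizer, and the message strings are hypothetical):

```python
def truncate_history(messages, max_tokens=50):
    """Keep the newest messages whose combined length fits the budget,
    dropping the oldest first — the usual sliding-window strategy."""
    kept, total = [], 0
    for msg in reversed(messages):       # walk from newest to oldest
        n = len(msg.split())             # crude token count
        if total + n > max_tokens:
            break
        kept.append(msg)
        total += n
    return list(reversed(kept))          # restore chronological order

history = [
    "user: hello there friend",
    "assistant: hi how can I help",
    "user: summarize this long report",
]
print(truncate_history(history, max_tokens=12))
```

Anything that falls outside the window is simply invisible to the model, which is also why there is no memory between sessions unless the application stores and re-injects it.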

Social Limitations

  • Theory of mind: Limited understanding of human intentions/beliefs
  • Cultural nuance: Struggles with complex cultural contexts
  • Ethical reasoning: Simplified moral frameworks

Technical Constraints

  • Latency: Generation speed limitations
  • Computing costs: Resource intensity of large models
  • Error recovery: Difficulty recognizing and correcting mistakes
  • Multimodal integration: Limited understanding across modalities

Mitigation Strategies

Enhancing Reliability

  • Tool integration for verified calculations/lookups
  • Chain-of-thought prompting for complex reasoning
  • Knowledge retrieval from external databases
  • Explicit verification steps for factual claims
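
Tool integration for verified calculations, mentioned above, usually means the model emits an expression and the application evaluates it deterministically instead of trusting in-context arithmetic. A minimal sketch of such a calculator tool using Python's `ast` module (safer than `eval`, since only arithmetic nodes are allowed):

```python
import ast
import operator

# Only these arithmetic operators are permitted; anything else raises.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculate(expression):
    """Evaluate a plain arithmetic expression, e.g. one an LLM emitted
    as a tool call, without the risks of eval()."""
    def ev(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

print(calculate("12 * (34 + 56)"))  # 1080
```

The model's job shrinks to producing the expression; the exact answer comes from code, which sidesteps the mathematical-reasoning gap entirely.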

Improving Output

  • Clear instructions with specific format expectations
  • Example-based prompting for consistency
  • Breaking complex tasks into simpler components
  • Focused domain-specific prompting
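
Breaking complex tasks into simpler components often takes the shape of a pipeline: each stage handles one sub-task and feeds the next. A sketch of the pattern with plain functions standing in for what would, in practice, be separate LLM calls (the summarize/translate stages are hypothetical placeholders):

```python
# Each stage would be its own focused LLM call in a real system;
# plain functions stand in here to show the decomposition pattern.
def summarize(text):
    """Placeholder 'summarizer': keep only the first sentence."""
    return text.split(".")[0] + "."

def translate(text):
    """Placeholder 'translator': tag the text instead of translating."""
    return f"[FR] {text}"

def pipeline(text, steps):
    """Run the stages in order, passing each output to the next stage."""
    for step in steps:
        text = step(text)
    return text

result = pipeline("LLMs are useful. They also have limits.",
                  [summarize, translate])
print(result)  # "[FR] LLMs are useful."
```

Smaller, single-purpose prompts at each stage tend to be easier to test and less error-prone than one monolithic instruction.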

Safety Practices

  • Avoid overreliance on LLM outputs for critical decisions
  • Verify factual claims through independent sources
  • Maintain human oversight for sensitive applications
  • Recognize appropriate use cases and limitations