Organizations

@EdwinMichaelLab

junkim100/README.md

Hi there, I'm Dong Jun Kim 👋

I’m a Ph.D. researcher at Korea University’s NLP&AI Lab, specializing in Mechanistic Interpretability of Large Language Models (LLMs). My work focuses on reverse-engineering the inner workings of LLMs to uncover their decision-making processes, enhance transparency, and ensure alignment with human values. I am passionate about advancing AI research responsibly by bridging theoretical insights with practical applications.

🚀 Research Interests

  • Mechanistic Interpretability
    Reverse-engineering LLMs to understand their internal circuits, algorithms, and emergent behaviors. My work includes:

    • Developing sparse autoencoders to isolate interpretable features in transformer architectures
    • Probing causal relationships between model components and specific capabilities
    • Mapping information pathways in large-scale models to improve transparency and reliability
  • AI Safety
    Ensuring the safe and ethical deployment of AI systems by focusing on:

    • Automated attack detection (harmfulness/bias detection) using red-teaming techniques
    • Mitigating biases in LLMs through interpretability-driven methods
    • Designing scalable frameworks for aligning AI behavior with human values
  • Mechanistic Anomaly Detection
    Identifying unexpected or harmful behaviors in LLMs by analyzing their internal mechanisms. Key contributions include:

    • Developing tools to trace causal pathways for diagnosing anomalous outputs
    • Designing self-monitoring models for real-world deployment scenarios
    • Improving robustness under adversarial or high-stakes conditions
  • Reasoning Models & Agent Systems
    Investigating how LLMs reason and interact as agents to perform complex tasks reliably. My research focuses on:

    • Multi-step reasoning processes within transformer-based architectures
    • Building RAG agent systems for dynamic knowledge retrieval and integration
    • Exploring compositionality in neural networks for structured reasoning
  • Retrieval-Augmented Generation (RAG)
    Enhancing generative models with retrieval mechanisms to improve factual accuracy and groundedness. My work includes:

    • Designing retrieval pipelines optimized for domain-specific applications
    • Reducing hallucinations by embedding retrieval mechanisms into transformer workflows
    • Improving factual consistency in generative outputs through hybrid architectures
  • Cognitive Alignment & Ethical AI Design
    Ensuring that AI systems operate consistently with ethical principles and human intentions. Contributions include:

    • Embedding ethical guidelines into model training processes through scalable fine-tuning methods
    • Leveraging interpretability tools to monitor alignment over time
    • Collaborating across disciplines to design responsible AI frameworks

🛠 Core Expertise

  • Mechanistic Interpretability: Developing novel techniques to analyze the internal structures and decision-making pathways of LLMs.
  • AI Safety & Bias Mitigation: Creating robust frameworks to ensure ethical and safe deployment of advanced AI systems.
  • RAG & Agent Systems: Designing retrieval-augmented generation pipelines and agent-based systems for dynamic decision-making tasks.

🧰 Tools & Technologies

Machine Learning Frameworks:

PyTorch, TensorFlow, Hugging Face, LangChain, DeepSpeed, Flax

Tools & Platforms:

Docker, Kubernetes, Streamlit, Weights & Biases, CodaLab

Optimization & Acceleration:

ONNX Runtime, TensorRT Optimization

Additional Frameworks for LLM Development:

  • LlamaIndex: Efficient data indexing and retrieval for LLM-driven applications.
  • Haystack: End-to-end NLP framework for building search systems and conversational agents.
  • LangFlow: Tool for orchestrating complex workflows in LangChain-based applications.
  • Helicone: Observability platform tailored for monitoring LLM-powered applications.
  • Gemini (Google): Advanced multimodal language models optimized for enterprise-level tasks.

📚 Education & Research Experience

Korea University (2024 – Present)

As a Ph.D. researcher at the NLP&AI Lab under Dr. Heui-Seok Lim, I have contributed to government- and industry-funded projects, including collaborations with the Ministry of Food and Drug Safety and KT Gen AI Lab. Key projects include:

  • Developing a novel knowledge editing method for domain-specific applications without retraining models.
  • Designing automatic attack detection frameworks (harmfulness/bias detection) using red-teaming techniques.
  • Creating advanced RAG agent systems capable of dynamic knowledge retrieval for real-time decision-making tasks.

University of South Florida (2019 – 2023)

During my B.S. in Computer Science, I worked under Dr. Wanwan Li on augmented reality systems, focusing on automatic room mapping using SLAM algorithms. Additionally, I collaborated with Dr. Edwin Michael to develop agent-based models for pandemic simulations in Hillsborough County, contributing to public health planning during COVID-19.

🌐 Connect with Me

Pinned

  1. sil_evolving_dataset Public

    Self-Improving Leaderboard Evolving Dataset

    Python

  2. model-explorer Public

    Interactive TUI for Transformer Model Analysis

    Python 1

  3. SAELens Public

    Forked from jbloomAus/SAELens

    Training Sparse Autoencoders on Language Models

    Jupyter Notebook

  4. makeMoE Public

    Forked from AviSoori1x/makeMoE

    From scratch implementation of a sparse mixture of experts language model inspired by Andrej Karpathy's makemore :)

    Jupyter Notebook

  5. ARENA_3.0 Public

    Forked from callummcdougall/ARENA_3.0

    HTML

  6. nixos-dotfiles Public

    Nix