
NeuroAI-Cognition-Hub

Neuro-Symbolic AI and Cognition Links

Welcome to the Neuro-Symbolic AI and Cognition Links repository, a curated collection of resources, links, and references on neuro-symbolic artificial intelligence, with a particular focus on cognition within AI. Whether you're a researcher, student, or AI enthusiast, this collection aims to help you dive deeper into cognitive models, architectures, and AI systems that simulate human-like thinking processes. Stay up to date with the latest developments and research in this evolving field, contribute to the repository to help it grow, and join us in exploring the intersection of neuroscience, symbolic reasoning, and AI cognition.

Table of Contents

  1. About Neuro-Symbolic AI and the Common Model of Cognition
  2. Cognition in AI - 2024 synopsis
  3. Featured
  4. Survey papers
  5. Symbolic Language
  6. Symbolic Reasoning AI major projects
  7. Neuro-symbolic AI major projects
  8. Knowledge representation major projects
  9. Cognitive Architectures and Generative Models
  10. Common Model of Cognition
  11. Memory AI major projects
  12. Meta-level control major projects
  13. Benchmarks
  14. Generative AI Impactful Projects
  15. Useful AI tools (2024) - There's literally an AI for everything
  16. Links to other useful GitHub pages
  17. Books
  18. Usage
  19. Contributing
  20. License

Curated up to week 10, day 0 (this week's additions are in progress).

About Neuro-Symbolic AI

Neuro-symbolic AI is an interdisciplinary approach that combines symbolic reasoning with neural networks to create more advanced and intelligent AI systems. This repository provides links to research papers, articles, and tools that explore this integration.
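To make the idea concrete, here is a minimal, self-contained sketch of the basic pattern: a neural module produces soft perceptions, and a symbolic layer applies explicit, human-readable rules to the resulting symbols. Everything here (the toy classifier, the rule table) is illustrative and not taken from any particular system.

```python
import numpy as np

# Toy "neural" perception: scores for the classes an input might belong to.
# In a real system this would be a trained network; here it is a stand-in.
def neural_perception(x: np.ndarray) -> dict:
    logits = x @ np.random.default_rng(0).normal(size=(x.size, 3))
    probs = np.exp(logits) / np.exp(logits).sum()
    return dict(zip(["cat", "dog", "car"], probs))

# Symbolic layer: an explicit, human-readable rule over the predicted symbol.
RULES = {
    "cat": "animal",
    "dog": "animal",
    "car": "vehicle",
}

def neuro_symbolic_pipeline(x: np.ndarray) -> str:
    probs = neural_perception(x)
    symbol = max(probs, key=probs.get)   # neural output -> discrete symbol
    category = RULES[symbol]             # symbol -> rule-based inference
    return f"{symbol} (inferred category: {category}, p={probs[symbol]:.2f})"

print(neuro_symbolic_pipeline(np.ones(4)))
```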

Cognition in AI

Cognition within AI is an essential aspect of building machines that can think and understand like humans. We've gathered resources related to cognitive models, cognitive architectures, and AI systems that simulate human-like thinking processes.

The Grounding Problem

The Common Model of Cognition

Featured

  • Website - AI model and dataset repository

  • Website - "Deep Learning Is Hitting a Wall" by Gary Marcus.

  • GitHub - Visit the GitHub repository for OpenCog.

  • Website - Publications on Explainable AI

Upcoming Conferences / Symposia

| Conference / Symposium | Location | Abstract Submission | Paper Submission |
| --- | --- | --- | --- |
| AAAI-25 Fall Symposium Series | Washington, D.C., USA | August 7, 2024 | August 15, 2024 |
| EAAI-25 (Educational Advances in AI) | Philadelphia, Pennsylvania, USA | September 9, 2024 | September 16, 2024 |
| IEEE ICRA 2025 (International Conference on Robotics and Automation) | TBA | September 15, 2024 | November 15, 2024 |
| ICLR 2025 (International Conference on Learning Representations) | Singapore EXPO, Singapore | September 27, 2024 | October 1, 2024 |
| AISTATS 2025 (International Conference on Artificial Intelligence and Statistics) | TBA | Expected October 2024 | Expected October 2024 |
| IEEE CVPR 2025 (Conference on Computer Vision and Pattern Recognition) | TBA | Expected November 2024 | Expected November 2024 |
| AAMAS 2025 (International Conference on Autonomous Agents and Multiagent Systems) | TBA | Expected December 2024 | Expected December 2024 |
| ACL 2025 (Association for Computational Linguistics) | TBA | Expected January 2025 | Expected January 2025 |
| IJCAI-25 (International Joint Conference on Artificial Intelligence) | Montreal, Canada | Expected January 2025 | Expected January 2025 |
| IEEE ISIT 2025 (International Symposium on Information Theory) | TBA | Expected January 2025 | Expected January 2025 |
| ACM SIGIR 2025 (Special Interest Group on Information Retrieval) | TBA | Expected January 2025 | Expected February 2025 |
| ICML 2025 (International Conference on Machine Learning) | TBA | Expected February 2025 | Expected February 2025 |
| ECAI 2025 (European Conference on Artificial Intelligence) | TBA | Expected March 2025 | Expected April 2025 |
| ACM SIGKDD 2025 (Knowledge Discovery and Data Mining) | TBA | Expected March 2025 | Expected April 2025 |
| ACM CHI 2025 (Conference on Human Factors in Computing Systems) | TBA | Expected Fall 2024 | Expected Fall 2024 |
| ACM SIGGRAPH 2025 (Computer Graphics and Interactive Techniques) | TBA | Expected Early 2025 | Expected Early 2025 |
| NeurIPS 2025 (Conference on Neural Information Processing Systems) | TBA | Expected May 2025 | Expected May 2025 |

Survey Papers

Publication Year Name Description Paper Link GitHub Link Summary
2024 LLMs & KGs Unifying Large Language Models and Knowledge Graphs: A Roadmap arXiv N/A
Summary The paper presents a roadmap for the unification of Large Language Models (LLMs) and Knowledge Graphs (KGs), aiming to leverage their strengths and overcome individual limitations for advanced AI capabilities. It identifies LLMs' general knowledge, language processing abilities, and generalizability, but also their challenges with factual knowledge, interpretability, and black-box nature. Conversely, KGs are commended for their structured knowledge representation, accuracy, and interpretability, though they face difficulties in construction and evolution. The roadmap suggests three frameworks for integration: enhancing LLMs with KGs, augmenting KGs with LLMs, and a synergized approach combining both. These integrations aim to improve the capabilities and interpretability of AI systems, addressing current challenges and exploring new research directions. This comprehensive analysis demonstrates potential applications across various domains, underscoring the significance of this unification in enhancing AI's performance and understandability.
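One of the roadmap's integration patterns, KG-enhanced LLMs, can be illustrated with a small sketch: retrieve triples about entities mentioned in a question and place them in the prompt. The tiny triple store and the llm_generate stub below are hypothetical placeholders, not code from the paper.

```python
# Minimal sketch of KG-enhanced prompting. The triple store and llm_generate()
# are illustrative placeholders, not a real knowledge graph or model client.
KG = [
    ("Marie Curie", "field", "physics"),
    ("Marie Curie", "award", "Nobel Prize in Physics"),
    ("Marie Curie", "born_in", "Warsaw"),
]

def retrieve_triples(question: str, kg=KG):
    """Return triples whose subject is mentioned in the question."""
    return [t for t in kg if t[0].lower() in question.lower()]

def build_prompt(question: str) -> str:
    facts = retrieve_triples(question)
    fact_lines = "\n".join(f"- {s} {p.replace('_', ' ')} {o}" for s, p, o in facts)
    return (
        "Answer using only the facts below.\n"
        f"Facts:\n{fact_lines}\n"
        f"Question: {question}\nAnswer:"
    )

def llm_generate(prompt: str) -> str:   # placeholder for a real LLM call
    return "(model output would go here)"

print(build_prompt("Where was Marie Curie born?"))
```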
2023 NeuralSym A Survey on Neural-symbolic Learning Systems arXiv N/A
Summary"A Survey on Neural-symbolic Learning Systems" examines the combination of neural networks and symbolic AI. It covers the AI evolution, integration challenges, and methods like learning for reasoning, reasoning for learning, and a joint approach. The survey highlights efficiency, generalization, interpretability improvements, diverse applications, and future research directions in AI.
- Discusses neural-symbolic system integration
- Explores AI evolution, challenges, and methodologies
2023 NeuroTax Neurosymbolic AI and its Taxonomy: a survey arXiv N/A
Summary"Neurosymbolic AI and its Taxonomy: A Survey" explores the integration of neural networks with symbolic AI towards AGI. It covers AI evolution, knowledge representation, learning and reasoning processes, and emphasizes the importance of explainability and trustworthiness in AI. The survey analyzes various neurosymbolic models, their applications, and suggests future research directions, offering an overview of neurosymbolic AI's advancements, methodologies, and prospects.
- Discusses AI evolution and complexity
- Analyzes neurosymbolic models and applications
2023 NeuroWave Neurosymbolic AI: The 3rd Wave arXiv N/A
Summary"Neurosymbolic AI: The 3rd Wave" provides a comprehensive analysis of the merging of neural networks and symbolic AI. Highlighting the field's evolution, it discusses current debates and technical aspects of neurosymbolic computing. The paper stresses the need for integrating learning and reasoning, efficiency, scalability, and trust in AI. It compares different models and discusses the AAAI 2020 conference's role in advancing neurosymbolic AI. The paper calls for leveraging both symbolic and neural approaches for more advanced AI systems.
- Emphasizes integration of neural and symbolic AI
- Discusses AI challenges and model comparisons
2023 GNN-NSC Survey Graph Neural Networks Meet Neural-Symbolic Computing arXiv N/A
Summary This survey explores the integration of Graph Neural Networks (GNNs) with Neural-Symbolic Computing (NSC). It delves into various GNN models, their application in relational learning, reasoning, and combinatorial optimization. The paper emphasizes GNNs' role in efficiency, scalability, and real-world applications, addressing integration challenges and future research directions. It highlights the importance of explainability and trust in AI systems, offering a comprehensive view of GNNs' potential in NSC.
- Discusses GNN models and applications
- Explores integration challenges and future directions
2022 KnowledgeGraph A Survey on Knowledge Graphs: Representation, Acquisition, and Applications IEEE N/A
Summary The survey on knowledge graphs offers a comprehensive review of their representation, acquisition, and applications. It delves into knowledge graph embedding, focusing on aspects like representation space, scoring functions, and encoding models, and includes auxiliary information. The study also explores knowledge acquisition, especially graph completion, through methods like embedding, path inference, and logical reasoning. It highlights applications in natural language understanding and recommendation systems. Emerging topics, such as transformer-based encoding and graph neural networks, are addressed. The survey also discusses entity discovery and neural relation extraction, incorporating modern techniques like attention mechanisms. Future research directions and resources for knowledge graph research are provided.
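The embedding material in this survey (representation spaces and scoring functions) can be made concrete with a toy TransE-style scorer, where a triple (h, r, t) is considered plausible when h + r lies close to t. This is a generic sketch with untrained vectors, not code from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8
entities = {name: rng.normal(size=DIM) for name in ["paris", "france", "berlin", "germany"]}
relations = {"capital_of": rng.normal(size=DIM)}

def transe_score(h: str, r: str, t: str) -> float:
    """TransE plausibility: higher (less negative) when h + r is close to t."""
    return -float(np.linalg.norm(entities[h] + relations[r] - entities[t]))

# The embeddings here are untrained, so the numbers are arbitrary; training
# would push true triples toward higher scores than corrupted ones.
print(transe_score("paris", "capital_of", "france"))
print(transe_score("paris", "capital_of", "germany"))
```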
2021 NSAI Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization arXiv GitHub, GitHub, GitHub, GitHub, GitHub, GitHub
Summary Neuro-symbolic AI (NSAI) represents a novel integration of traditional rules-based AI approaches with modern deep learning techniques, offering advancements in image and video reasoning while reducing the need for extensive training data. This paper provides an in-depth analysis of three distinct NSAI models: the Neuro-Symbolic Concept Learner (NSCL), Neuro-Symbolic Dynamic Reasoning (NS-DR), and Neural Logic Machines (NLM). While NSCL and NS-DR are composed of several submodels including image/video parsers and symbolic executors, NLM functions as an end-to-end model. The analysis reveals that NSAI models generally exhibit less potential for parallelism compared to traditional neural models, primarily due to their complex control flow and operations such as scalar multiplication. Data movement is highlighted as a potential bottleneck, similar to other machine learning workloads. The paper categorizes the operations within NSAI models into eight types for performance analysis, suggesting that while the neural components often dominate, there are opportunities for acceleration, especially in handling low-operational-intensity operations.

Neuro-symbolic AI major projects

Publication Year Name Description Paper Link GitHub Link Summary
2024 neuro symbolic text game A Hybrid Neuro-Symbolic approach for Text-Based Games using Inductive Logic Programming Paper N/A
Summary This paper presents a hybrid neuro-symbolic architecture for Text-Based Games (TBGs), combining symbolic reasoning with neural reinforcement learning. It uses inductive logic programming to learn symbolic rules as default theories with exceptions, enabling non-monotonic reasoning in partially observable environments. The approach employs WordNet for rule generalization, enhancing adaptability to unseen objects and scenarios. The architecture features a context encoder, action encoder, neural and symbolic action selectors, and a symbolic rule learner, with priority given to symbolic reasoning. The model outperforms traditional methods in TBGs, showing potential for future improvements in action selection and agent adaptability.
2024 Plan-SOFAI Plan-SOFAI: A Neuro-Symbolic Planning Architecture Paper NA
Summary Plan-SOFAI introduces a neuro-symbolic architecture for AI planning inspired by Kahneman's cognitive theory. It integrates fast (System-1) and slow (System-2) thinking models, utilizing System-1 for quick solutions based on past experiences and System-2 for logical, reasoned approaches. A metacognitive module oversees solver selection, balancing speed and accuracy. Focused on classical planning problems, Plan-SOFAI demonstrates versatility and efficiency in various testing scenarios, outperforming traditional methods in balancing solving speed and solution optimality. The architecture's adaptability allows integration of new techniques, promising broader applications beyond classical planning. Future efforts aim to enhance Plan-SOFAI's capabilities and explore new domains of application.
2023 PseudoSL A Pseudo-Semantic Loss for Autoregressive Models with Logical Constraints arXiv GitHub
Summary The paper introduces a novel Pseudo-Semantic Loss for Autoregressive Models to incorporate logical constraints into deep learning training processes. Addressing the computational complexity of maximizing symbolic constraint likelihoods in expressive distributions like transformers, the approach employs a pseudolikelihood-based approximation around a model sample. This innovation ensures the efficient computation of neuro-symbolic losses. Empirically validated across diverse tasks like Sudoku solving, Warcraft path prediction, and language model detoxification, the method significantly improves the production of logically consistent outputs and reduces language model toxicity. This work extends neuro-symbolic learning, combining symbolic knowledge representation with neural network capabilities, and enhancing model reliability and explainability.
2023 SemStreng Study on Semantic Strengthening arXiv GitHub
SummaryThe paper introduces "Semantic Strengthening of Neuro-Symbolic Learning," addressing the computational challenges in neuro-symbolic methods. It iteratively strengthens an approximation by focusing on the most relevant constraints, measured through conditional mutual information. This process ensures better alignment of gradients between constraint distributions, enhancing the model's accuracy in structured-output tasks. The approach efficiently computes mutual information using tractable circuits and maintains sound probabilistic semantics. Evaluated on complex tasks like Warcraft path prediction, Sudoku solving, and MNIST matching, the method significantly improves prediction accuracy. This work combines neural networks' feature extraction prowess with symbolic reasoning, offering a scalable solution for advanced neuro-symbolic learning.
2023 CPG Compositional Program Generator, a neuro-symbolic architecture for few-shot compositional generalization arXiv GitHub
SummaryThe "Compositional Program Generator" (CPG) is a novel neuro-symbolic architecture designed for efficient language processing, particularly in few-shot learning scenarios. Unlike conventional neural networks, which struggle with systematic generalization and require extensive data, CPG excels in learning new concepts with minimal examples. Its core strengths lie in three attributes: modularity, where it uses specialized modules for different semantic functions; composition, allowing the combination of modules for varied input types; and abstraction, employing grammar rules for consistent input processing. CPG's innovative approach is showcased through its remarkable performance on standard benchmarks like SCAN and COGS, achieving state-of-the-art results with drastically fewer data samples. This efficiency is facilitated by its curricular training method, which incrementally adjusts to varying sentence lengths and complexities. CPG's ability to retrain efficiently for new concepts or grammar adaptations, without forgetting previous learnings, indicates its potential for broader applications in systematic language generalization tasks.
2023 NeuroConcept Compositional diversity in visual concept learning arXiv 2023 N/A yet
Summary This study investigates human abilities in learning and generating novel visual concepts through compositionality, contrasting with limitations in computer vision models. Using "alien figures," it explores how humans classify and create these figures. Experiments reveal humans’ strong inductive biases and nuanced behaviors, particularly in generating novel patterns. A Bayesian program induction model captures a range of compositional behaviors, but struggles with subtle nuances observed in human responses. To address this, a generative neuro-symbolic model is introduced, blending neural networks with symbolic representations. The research underscores the complexity of human visual concept learning and the challenges in computationally modeling it.
2023 ULKB Universal Logic Knowledge Base N/A GitHub
Summary The Universal Logic Knowledge Base (ULKB) by IBM is a Higher Order Logic (HOL)-based framework designed for reasoning over knowledge graphs. ULKB consists of two primary components: ULKB Logic, a HOL language and interactive theorem-prover environment, and ULKB Graph, a core knowledge graph enhanced by external knowledge bases. The repository, hosted on GitHub, includes ULKB Logic's Python library source code, with examples and tutorials located in the 'examples' directory. The repository also features ontology files and graph generation code under 'graph', along with testing code. The project supports reasoning in complex knowledge domains, emphasizing logic, knowledge graphs, and theorem proving. Installation and testing instructions are provided, highlighting its ease of use for developers and researchers in knowledge representation and reasoning.
2023 IBM-LNN Learning Neuro-Symbolic World Models with Logical Neural Networks Paper GitHub
Summary The paper introduces a neuro-symbolic framework using Logical Neural Networks (LNN) for model-based reinforcement learning, addressing real-world problems needing explainable models and limited training data. It employs LNNs for scalable rule learning, integrated with object-centric perception modules and AI planners. The framework excels in PDDLGym environments and significantly outperforms existing agents in the TextWorld-Commonsense domain. It adeptly handles noisy data, leverages STRIPS operators for modeling actions, and enhances exploration in relational RL. This work significantly contributes to neuro-symbolic learning, showcasing its practical application in constructing robust, interpretable AI planning systems and advancing the field of artificial intelligence.
2023 IBM-Proprioception Neuro-Symbolic World Models with Conversational Proprioception Paper N/A
Summary This paper presents a novel method for learning neuro-symbolic world models in text-based games using Logical Neural Networks (LNN). Focusing on the TextWorld-Commonsense set of games, it introduces the concept of conversational proprioception, enhancing model-based reinforcement learning by incorporating the memory of previous actions and constraints based on this memory. The approach significantly improves game-solving performance, as evidenced by substantial reductions in average steps and increases in average scores. Utilizing semantic parsing for logical state approximation and planning with learned logical models, the method provides better explainability and effectiveness in decision-making compared to existing neuro-symbolic agents.
2023 IBM-LOA Logical Optimal Actions: a neuro-symbolic architecture for text-based interaction games arXiv GitHub
Summary LOA (Logical Optimal Actions) is a novel reinforcement learning architecture that combines neural networks and symbolic logic for text-based interaction games. It leverages Logical Neural Networks (LNN) for logical reasoning, rule training, and improved interpretability of AI decisions. The framework focuses on language understanding, requiring skills such as long-term memory and common sense reasoning, within complex text-based game environments. LOA is demonstrated through a web-based platform, allowing users to play text-based games and visualize logical rule learning. An open-source implementation enhances its accessibility for experimentation. LOA represents a significant step in applying neuro-symbolic approaches to real-world language-based interactions.
2022 Semantic NS computing A Semantic Framework for Neural-Symbolic Computing arXiv 2022 N/A
Summary Simon Odense and Artur d’Avila Garcez's paper, "A Semantic Framework for Neural-Symbolic Computing," presents an innovative framework integrating neural networks and symbolic artificial intelligence. This integration addresses the limitations of both approaches, proposing a standard for encoding symbolic knowledge into neural networks. It covers key aspects like semantic encoding, probabilistic models, and semantic regularization, contributing significantly to explainable AI. The framework enables a unified understanding and comparison of diverse neural-symbolic methods. Although it faces challenges like expressive limitations, this pioneering work lays the groundwork for future advancements in neural-symbolic computing, a crucial step towards more advanced, interpretable AI systems.
2021 Neuro-Symbolic RL Neuro-Symbolic Reinforcement Learning with First-Order Logic Paper Toolkit
Summary The paper presents a novel neuro-symbolic reinforcement learning approach for text-based games, utilizing Logical Neural Networks (LNN) to convert observations into logical facts and train interpretable policies. The method, which outperforms state-of-the-art approaches in convergence speed and interpretability, first transforms text observations into first-order logical facts with the aid of ConceptNet. LNNs are then employed to learn symbolic rules, merging neural network learning with logical reasoning. Experiments on TextWorld games across varying difficulty levels demonstrate the method’s effectiveness. The approach offers increased transparency by enabling the extraction of logical rules, addressing ethical considerations of model interpretability and ensuring computational efficiency.
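The pipeline shared by these LNN-based text-game agents, turning an observation into logical facts and letting rules select an action, looks roughly like the sketch below. Plain Python stand-ins are used for the fact extractor and the rules; a real agent would rely on semantic parsing, ConceptNet, and learned LNN rules instead.

```python
import re

def extract_facts(observation: str) -> set:
    """Very rough text-to-facts step (a real system uses semantic parsing/ConceptNet)."""
    facts = set()
    for obj in re.findall(r"a (\w+)", observation):
        facts.add(("present", obj))
    if "locked" in observation:
        facts.add(("locked", "door"))
    return facts

# Hand-written rules: preconditions -> action (an LNN would learn weighted rules instead).
RULES = [
    ({("present", "key"), ("locked", "door")}, "take key"),
    ({("locked", "door"), ("holding", "key")}, "unlock door with key"),
]

def choose_action(facts: set, holding=frozenset()) -> str:
    facts = facts | {("holding", o) for o in holding}
    for preconditions, action in RULES:
        if preconditions <= facts:
            return action
    return "look"

obs = "You are in a hallway. There is a key on a table. The door is locked."
print(choose_action(extract_facts(obs)))   # -> "take key"
```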
2021 SLATE Neuro-Symbolic Approaches for Text-Based Policy Learning Paper N/A
Summary SLATE introduces a novel neuro-symbolic approach for interpretable policy learning in text-based games, using symbolic rule learning from textual observations. It significantly improves generalization to unseen games and outperforms previous methods with fewer training games. SLATE learns interpretable and logically consistent action rules through gradient-based training, employing both MLP and LNN models. It offers clear insights into the learned policy, contributing to more reliable and ethically sound AI systems. Future work aims to extend SLATE to handle a broader range of predicates and domain-agnostic graphical states, further enhancing its applicability and effectiveness in natural language reinforcement learning.
2021 PLNN Training Logical Neural Networks by Primal–Dual Methods for Neuro-Symbolic Reasoning IEEE Star
Summary "Training Logical Neural Networks by Primal–Dual Methods for Neuro-Symbolic Reasoning" explores the training of Logical Neural Networks (LNNs) under challenging constraints. The paper introduces a unified framework utilizing primal-dual optimization for this purpose, focusing on achieving convergence to KKT points in non-linear, nonconvex optimization scenarios. The proposed Inexact Alternating Direction Method of Multipliers (iADMM) effectively addresses the complexities in training LNNs by handling nonconvex inequality constraints. This advancement demonstrates superior results in training LNN models on real-world data sets, validating the approach's efficiency. The work sets a precedent for future research in optimizing LNNs, a vital tool for neuro-symbolic AI.
2020 LTN Logic Tensor Networks arXiv 2020 Star
Summary Logic Tensor Networks (LTN) present a neurosymbolic AI framework, integrating logic and neural networks through a differentiable logical language, Real Logic. It grounds symbols from first-order logic onto data using neural networks, allowing rich knowledge representation and reasoning. LTN supports diverse AI tasks like classification, clustering, and regression. Learning involves parameter optimization to maximize formula satisfiability, while querying enables evaluation of truth values and inferences on new data. The framework includes methods for logical reasoning and addresses gradient issues through stable product real logic. Implemented in TensorFlow 2, LTN combines efficient computation with the expressiveness of first-order logic.
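To illustrate the Real Logic idea behind LTN, here is a small NumPy sketch (deliberately not the TensorFlow ltn package's API): predicates are differentiable functions into [0, 1], connectives are fuzzy operators, and a formula's truth value is a satisfaction degree that training would maximize by gradient descent.

```python
import numpy as np

rng = np.random.default_rng(0)

def predicate(weights):
    """Ground a unary predicate as a tiny neural net: x -> truth value in [0, 1]."""
    def P(x):
        return 1.0 / (1.0 + np.exp(-(x @ weights)))
    return P

# Fuzzy connectives (product t-norm family, in the spirit of stable product real logic).
def Not(a):        return 1.0 - a
def And(a, b):     return a * b
def Implies(a, b): return 1.0 - a + a * b          # Reichenbach implication
def Forall(values, p=2):                           # smooth "for all" via a generalized mean of errors
    return 1.0 - np.mean((1.0 - values) ** p) ** (1.0 / p)

Smokes = predicate(rng.normal(size=3))
Cancer = predicate(rng.normal(size=3))
people = rng.normal(size=(5, 3))                   # 5 individuals, 3 features each

# Satisfaction degree of: forall x. Smokes(x) -> Cancer(x)
sat = Forall(Implies(Smokes(people), Cancer(people)))
print(f"satisfaction degree: {sat:.3f}")
```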

Symbolic Language

Publication Year Name Description Paper Link GitHub Link Summary
2002-22 PropBank PropBank (proposition bank) approach to semantic role labeling over the last two decades Paper GitHub
Summary PropBank, evolving over twenty years, has significantly expanded its scope, encompassing non-verbal predicates such as adjectives, prepositions, and multi-word expressions, as well as a broader range of domains, genres, and languages. This expansion enhances the testing and generalization capabilities of semantic role labeling (SRL) systems. PropBank's methodology, focusing on semantic role labeling, has transitioned from relying on syntactic parses to richer semantic representations. A key component of PropBank is its Frames, housing rolesets with predicate argument structures, now essential to various meaning representations like AMR and UMR. Recent developments include the inclusion of non-verbal predicates and an extensive overhaul of the lexicon to support domain-specific annotation projects such as Spatial AMR and the THYME project. These adaptations have increased PropBank's utility and relevance in diverse NLP applications. PropBank's enhanced web presence, user-friendly tools, and accessibility improvements, such as utility scripts and a searchable frame files website, have made it a more powerful resource for researchers and practitioners in the field.
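For readers new to PropBank, a single annotated predicate looks roughly like the hand-written illustration below, using the give.01 roleset; it shows the annotation format rather than an actual corpus excerpt.

```python
# Illustrative PropBank-style annotation for: "The teacher gave the students a quiz."
annotation = {
    "sentence": "The teacher gave the students a quiz.",
    "predicate": "gave",
    "roleset": "give.01",          # frame file entry for the 'transfer' sense of give
    "arguments": {
        "ARG0": "The teacher",     # giver
        "ARG1": "a quiz",          # thing given
        "ARG2": "the students",    # entity given to
    },
}

for role, span in annotation["arguments"].items():
    print(f"{role}: {span}")
```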
2017-22 Universal PropBank Annotate text in different languages with a layer of "universal" semantic role labeling annotation Website GitHub
Summary The Universal Proposition Banks (UP) project aims to annotate texts in various languages with "universal" semantic role labeling, using English Proposition Bank's frame and role labels. UP2.0, an enhancement over v1.0, offers higher quality PropBanks using advanced monolingual SRL and improved annotation auto-generation. It expands language coverage from 7 to 23 and introduces span annotation for syntactic analysis decoupling, along with Gold data for some languages. UP2.0 is built on the Universal Dependency Treebanks release 2.9, utilizing English PropBank version 3.0 labels. The project focuses on annotating verbs with English frames, excluding auxiliary verbs, and aims to label about 90% of all verbs in each language. The annotations, mostly model-predicted, vary in quality due to domain differences. UP facilitates research in multilingual and cross-lingual SRL, with applications in advanced NLP and IBM products. Future work includes adding new languages and improving annotation quality. UP2.0 provides a valuable resource for expanded shallow semantic parsing and semantic role labeling research and applications.
2022 WANLI Worker and AI Collaboration for NLI Dataset Creation Paper GitHub
Summary The paper introduces WANLI, a dataset created through collaboration between human workers and AI, specifically GPT-3, for NLI tasks.
2019 CommonsenseQA CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge Paper Website
Summary CommonsenseQA is a dataset for commonsense question answering, challenging AI systems with questions requiring human-like common sense and background knowledge. This dataset, distinct from traditional QA benchmarks, contains 12,247 questions, each with five answer choices. The questions are constructed from ConceptNet, a knowledge graph of commonsense relations between concepts. To generate diverse and challenging questions, crowd-workers create multiple-choice queries that relate to a given concept from ConceptNet and include distractors. Unlike tasks where answers are found in a provided context, CommonsenseQA demands understanding of spatial relations, causes and effects, scientific facts, and social norms. Human performance on CommonsenseQA reaches 89%, while advanced models like BERT-large yield only 56% accuracy, indicating the complexity and challenge posed by the dataset. This gap highlights the difficulty machines face in mastering commonsense reasoning, a seemingly trivial task for humans. CommonsenseQA serves as a valuable tool for advancing natural language understanding models in grasping the nuances of human common sense.
2018 MultiNLI A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference Paper Website
Summary The paper introduces the Multi-Genre Natural Language Inference (MultiNLI) corpus, a comprehensive dataset designed to advance machine learning models in sentence understanding. With 433,000 examples, MultiNLI stands out in its size and the diversity of its content, encompassing ten different genres of both written and spoken English. This range allows for a broader evaluation of models, tackling the full complexity of language and providing an explicit setting for cross-genre domain adaptation studies. The dataset creation followed a methodology similar to the Stanford NLI (SNLI) corpus but expanded to include a wider array of text sources. An evaluation of existing machine learning models on this new dataset indicates that MultiNLI presents a more challenging task than SNLI, despite showing similar levels of inter-annotator agreement. The MultiNLI corpus is made publicly available, offering a valuable resource for NLU research and domain adaptation studies in machine learning.
2018 Hypothesis NLI Hypothesis Only Baselines in Natural Language Inference Paper GitHub
Summary The study, "Hypothesis Only Baselines in Natural Language Inference" by Poliak et al., delves into the effectiveness of a hypothesis-only model in the context of Natural Language Inference (NLI). This approach deviates from the traditional models by focusing solely on the hypothesis, ignoring the context. The researchers conducted comprehensive experiments on ten varied NLI datasets. Astonishingly, they discovered that the hypothesis-only model remarkably surpassed the majority-class baseline in several of these datasets. This unanticipated performance level raises concerns about potential statistical irregularities within NLI datasets. The study also scrutinizes how the construction method of these datasets might contribute to such irregularities, questioning whether biases in dataset creation lead to 'giveaway' terms that simplify the task for the models. An analysis of words and grammatical structures within these datasets was conducted to understand better how these factors might influence the performance of NLI models. This investigation into the lexical semantics and grammaticality of the datasets revealed underlying biases that could be exploited by models. The paper's findings highlight the need for more robust and less biased dataset construction in NLI research and pave the way for future studies aimed at reducing these biases, thereby enhancing the reliability and validity of NLI models.
2018 OpenBookQA Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering Paper Website
Summary "OpenBookQA" is a dataset designed to simulate open book exams in question answering (QA) systems, aimed at assessing the understanding of elementary-level science. It comprises roughly 6,000 multiple-choice questions and a collection of 1,326 science facts. The uniqueness of this dataset lies in its requirement for combining a given fact (e.g., metals conduct electricity) with broad common knowledge (e.g., a suit of armor is made of metal). Unlike typical QA datasets that focus on linguistic understanding, OpenBookQA demands a deeper comprehension of both the subject matter and the language used. Human performance on this dataset is close to 92%, but state-of-the-art pre-trained QA systems perform poorly, revealing a significant gap. This gap indicates the challenge in retrieving relevant information and combining it with the core fact from the open book to answer questions that require multi-hop reasoning. OpenBookQA is intended as a benchmark to drive future research in more sophisticated QA and reasoning systems.
2012-18 VerbNet The largest online network of English verbs that links their syntactic and semantic patterns Website GitHub
Summary VerbNet (VN) is the largest online English verb lexicon, providing a hierarchical, domain-independent verb classification system. It extends and refines Levin's classes for syntactic and semantic coherence. Each class includes thematic roles, selectional restrictions, and frames combining syntactic descriptions with semantic predicates and temporal functions. VN integrates Levin's work with extensions by Korhonen and Briscoe, expanding its coverage, including verbs taking various complements. The integration enriches VN, increasing its classes, thematic roles, semantic predicates, and syntactic restrictions. This comprehensive Levin-style classification is essential for creating training corpora for syntactic parsers and semantic role labelers. VN's thematic role assignments and class memberships facilitate large-scale experimentation in syntax-based class utility for improving NLP tools.
2021 IWCS21 SemLink 2.0: linking PropBank, VerbNet, FrameNet, and WordNet Paper GitHub
Summary SemLink serves as a vital bridge connecting lexical semantic resources like PropBank, VerbNet, FrameNet, and WordNet. It enables the use of each resource's unique features, thereby enhancing semantic analysis capabilities. Recent updates have focused on automatic and manual enhancements to maintain consistency across evolving resources, alongside the introduction of sense embeddings and subject/object information. SemLink's size has nearly doubled through these updates, significantly improving its coverage. Essential for research in lexical semantics, word sense disambiguation, and semantic role labeling, SemLink continues to evolve, with future efforts aimed at filling gaps through manual annotation and evaluating the utility of linked resources.
2019 Wilcox19 Structural supervision improves learning of non-local grammatical dependencies Paper N/A
Summary This study investigates whether training language models with hierarchical structure supervision improves their ability to learn non-local grammatical dependencies, previously only examined for subject-verb agreement. By comparing LSTM models with two structurally supervised models – Recurrent Neural Network Grammars (RNNGs) and a version of Parsing-as-Language-Modeling – the research focuses on two grammatical dependencies: Negative Polarity licensing and Filler–Gap Dependencies. Findings reveal that structurally supervised models significantly outperform traditional LSTMs, particularly in RNNGs, which show superior understanding of complex grammatical rules. These results highlight the benefits of structural supervision in language model training, especially for complex grammatical learning.
2017 Verb Physics Relative Physical Knowledge of Actions and Objects arXiv GitHub Website
Summary The paper, "Verb Physics: Relative Physical Knowledge of Actions and Objects," presents an approach to infer physical properties and relationships from natural language text. This task is challenging due to reporting bias, as people rarely state obvious physical facts, like the size of a house relative to a person. However, such implicit knowledge can be deduced from how actions and objects are discussed in language. The research focuses on five dimensions of physical knowledge: size, weight, strength, rigidness, and speed. The methodology involves inferring knowledge about both object pairs and the physical implications of actions, recognizing that the way actions are described in language provides clues about the physical world. To gather data, the authors utilize syntactic patterns from natural language text, along with crowdsourced knowledge, to capture information about verbs and their implications. They choose verbs from Levin's classes based on frequency in the Google Syntax Ngrams corpus, focusing on the top 100 verbs. The approach employs a frame-centric perspective, analyzing the implications of verbs within specific contexts or frames. This method acknowledges that a single verb can have different implications based on the objects it interacts with or the context in which it is used. The main tool used for inference is a factor graph model, comprising nodes representing object pairs and action frames. This model includes elements like semantic similarities, action-object compatibility, and cross-knowledge correlation to understand the relationships between actions and objects across multiple dimensions. Experimental results demonstrate the effectiveness of this approach in predicting the physical relationships implied by various actions and between different object pairs. This finding suggests potential applications in AI, beyond language understanding, such as in computer vision and robotics. The research highlights the possibilities of extracting complex, nuanced physical knowledge from the subtleties of natural language, providing a basis for more advanced AI systems capable of intuitive reasoning about the physical world.
2017 DeepLog Anomaly detection in system logs through deep learning acm GitHub
Summary DeepLog leverages a deep neural network, specifically Long Short-Term Memory (LSTM), to analyze system logs for anomaly detection. It treats logs as natural language sequences, learning normal patterns for identifying deviations. Uniquely, DeepLog continually updates its model to adapt to new log patterns over time, enhancing its relevance and accuracy. The system is capable of diagnosing anomalies by constructing workflows from logs, offering insights into potential root causes. DeepLog proves effective in various settings, including HDFS and OpenStack logs, outperforming traditional data mining methods in detecting both execution path and parameter value anomalies, showcasing its versatility and dynamic adaptability.
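The core DeepLog recipe, modeling normal log-key sequences with an LSTM and flagging a log line as anomalous when the observed next key is not among the model's top-k predictions, can be sketched as follows. The data is a toy repeating pattern and Keras is assumed; this is a sketch, not the authors' implementation.

```python
import numpy as np
import tensorflow as tf

VOCAB, WINDOW, TOP_K = 6, 4, 2

# Toy "normal" log-key sequence (each integer stands for a parsed log template id).
normal = np.array([0, 1, 2, 3] * 80)
X = np.stack([normal[i:i + WINDOW] for i in range(len(normal) - WINDOW)])
y = normal[WINDOW:]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 8),
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(VOCAB, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=20, verbose=0)

def is_anomalous(history, next_key):
    """Flag next_key as anomalous if it is outside the model's top-k predictions."""
    probs = model.predict(np.array([history]), verbose=0)[0]
    return next_key not in np.argsort(probs)[-TOP_K:]

print(is_anomalous([0, 1, 2, 3], 0))   # continuation seen in training -> likely normal
print(is_anomalous([0, 1, 2, 3], 5))   # never-seen continuation -> likely anomalous
```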
2017 Ordinal CSI Ordinal Common-sense Inference Paper GitHub
Summary The paper by Sheng Zhang and colleagues proposes an evaluation framework for automated common-sense inference, focusing on ordinal human responses and creating the JOCI corpus.
2016 PredPATT Predicate-Argument Extraction from Universal Dependencies Paper GitHub
Summary The paper presents Decomp's project focusing on simplifying semantic annotations, covering semantic role, event, and word sense decomposition.
2015 PyStanfordDependencies Python Interface for Stanford Dependencies N/A GitHub
Summary Python interface for converting Penn Treebank trees to Universal Dependencies and Stanford Dependencies.
2015 Semantic Proto-roles Semantic proto role linking model Paper GitHub
Summary The study presents a large-scale corpus-based validation of Dowty's proto-role theory, proposing a shift from traditional categorical roles to a property-based annotation in understanding semantic relationships. It harnesses crowdsourcing to gather data on proto-agent and proto-patient properties across sentences in the PropBank corpus. Analyzing 11 proto-role properties, the study reveals significant role fragmentation with a wide array of unique property configurations. This approach highlights the nuanced nature of semantic roles, challenging existing frameworks like VerbNet and FrameNet. The findings offer significant insights for applications in semantic parsing and role labeling, with future expansion to other languages and broader linguistic analysis anticipated.
2016 Cloze Commonsense Stories A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories Paper N/A
Summary The 2016 paper by Nasrin Mostafazadeh et al., titled "A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories," addresses a critical gap in natural language processing (NLP): the evaluation of systems' understanding of causal and correlational relationships between events in narratives. Recognizing this as a foundational problem in achieving deep language understanding, the study introduces the 'Story Cloze Test'—a novel framework to assess story understanding and script learning. This test challenges a system to correctly conclude a four-sentence story with an appropriate ending, pushing the boundaries of current NLP capabilities. To support this evaluation framework, the authors developed a unique corpus named 'ROCStories,' consisting of 50,000 five-sentence commonsense stories. This corpus stands out for two primary reasons. Firstly, it captures a rich set of causal and temporal commonsense relations between daily events, which are essential for understanding narrative coherence. Secondly, it comprises high-quality, everyday life stories, beneficial not only for learning commonsense narrative schemas but also for training story generation models. One of the key findings of the paper is that existing models, predominantly relying on shallow language understanding, struggle to perform well on the Story Cloze Test. This result indicates a significant need for models that can comprehend and process language at a much deeper level, particularly in understanding the nuances of everyday events and their logical connections. The ROCStories corpus was meticulously crafted through a crowdsourcing approach, with specific guidelines ensuring coherence and realism in the stories. This process also included a rigorous quality control mechanism to maintain the high standard of the submissions. The paper concludes by underscoring the potential of the Story Cloze Test as a robust evaluation tool. It is suitable for assessing both story understanding and script knowledge learning models. The introduction of this new test and the ROCStories corpus is a stride towards fostering advancements in NLP, specifically in the realm of deeper language understanding and commonsense reasoning.
2015 NLI Corpus A large annotated corpus for learning natural language inference Paper Website
Summary The "Bowman 2015 SNLI" paper focuses on addressing the limitations of prior natural language inference (NLI) resources by introducing the Stanford Natural Language Inference (SNLI) corpus. This corpus, significantly larger than its predecessors, comprises over half a million sentence pairs written by humans. It serves as a robust platform for training and evaluating various NLI models, including neural networks and lexicalized classifiers. The paper demonstrates the importance of the SNLI corpus in enhancing model performance in understanding natural language and entailment, emphasizing the role of transfer learning. The availability of this corpus marks a significant step forward in NLI research, offering a rich dataset for future explorations in semantic representation and machine learning.
2014 CoreNLP Stanford CoreNLP NLP Toolkit Paper GitHub
Summary The Stanford CoreNLP toolkit provides an extensible pipeline for core NLP tasks and supports multiple languages.
2014 MORD Ordinal Regression in Python N/A GitHub
Summary Ordinal Regression denotes statistical learning methods for predicting discrete and ordered variables, such as movie ratings.
2013 Word2Vec Efficient Estimation of Word Representations in Vector Space Paper GitHub
Summary The paper introduces novel model architectures for computing continuous vector representations of words, significantly enhancing the accuracy of semantic and syntactic word similarity tasks.
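A minimal usage sketch with the gensim implementation of Word2Vec (assumed to be installed); real training needs a far larger corpus than the toy sentences used here.

```python
from gensim.models import Word2Vec

# Toy corpus: each "sentence" is a list of tokens. Real training needs far more text.
sentences = [
    ["neural", "networks", "learn", "representations"],
    ["symbolic", "systems", "use", "explicit", "rules"],
    ["neural", "and", "symbolic", "methods", "can", "be", "combined"],
    ["word", "vectors", "capture", "semantic", "similarity"],
] * 50

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1, sg=1, epochs=20)

print(model.wv["neural"][:5])                   # first few dimensions of a word vector
print(model.wv.most_similar("neural", topn=3))  # nearest neighbours in the toy space
```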
2012 Pattern for Python Web Mining Module for Python Paper GitHub
Summary Pattern for Python by Tom De Smedt and Walter Daelemans is a versatile package for web mining, NLP, and more.
2011 COPA Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning Paper Website
Summary "Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning" by Melissa Roemmele, Cosmin Adrian Bejan, and Andrew S. Gordon focuses on evaluating commonsense reasoning in AI. They introduced the Choice Of Plausible Alternatives (COPA) evaluation, consisting of 1000 English questions assessing commonsense causal reasoning. The study aims to advance research in commonsense reasoning by providing a validated question set with broad topics, testing both forward and backward causal reasoning. The methodology involves a two-alternative forced-choice format, where each question presents a premise and two plausible causes or effects, and the correct choice is the more plausible one. The authors discuss the challenges in creating such a dataset, ensuring topic breadth, language clarity, and high agreement among human raters. They also evaluate multiple baseline approaches using statistical NLP techniques, providing initial benchmarks for future systems.
2011 RTE-7 The Seventh PASCAL Recognizing Textual Entailment Challenge Paper Website
Summary The paper details the Seventh PASCAL Recognizing Textual Entailment Challenge (RTE-7), a pivotal event in advancing the field of Natural Language Processing (NLP). This challenge is critical for developing systems capable of determining whether the meaning of one text can be logically inferred from another, a fundamental aspect in applications such as Question Answering, Information Extraction, Summarization, and Machine Translation. RTE-7 introduced several innovations to enhance the realism and complexity of the challenge. These included the incorporation of longer text passages to better reflect real-world scenarios and a shared resource pool for participants, fostering collaboration and resource sharing. Additionally, the challenge featured a novel pilot task, encouraging systems to differentiate between unknown entailments and identified contradictions, and to provide justifications for their decisions. The challenge saw increased participation, with 26 teams submitting 44 runs, employing diverse approaches and showcasing advancements over previous iterations. The methods used ranged from deep linguistic analysis to logical inference, reflecting the growing complexity and sophistication in the field. This initiative not only served as a benchmark for textual entailment recognition but also stimulated research in semantic inference, highlighting the importance of this capability in a wide range of NLP applications. The results of RTE-7 demonstrated significant progress in textual entailment recognition, emphasizing the event's role in guiding future research directions and shaping the development of NLP technologies.
2009 Natural-Language An extended model of natural logic Paper N/A
Summary Bill MacCartney and Christopher D. Manning's work presents an innovative model for natural language inference, emphasizing the identification of valid inferences through lexical and syntactic features, without requiring full semantic interpretation. This model extends the scope of natural logic by integrating aspects like semantic exclusion and implicativity. It operates by breaking down the inference problem into a series of atomic edits linking premises to hypotheses, subsequently predicting the lexical semantic relation for each edit. These relations are propagated up a semantic composition tree and then combined across the edit sequence to assess inferential validity. The implementation of this model demonstrates notable accuracy and precision, particularly highlighted by its performance on the FraCaS test suite. Additionally, the paper delves into theoretical aspects, detailing an inventory of basic semantic relations and the challenges in mapping complex semantic relationships. The practical utility of this model is underscored by its successful application in various test suites, showcasing its capability to handle intricate language inference problems.
2003 FrameNet Linguistic theory and practice analysis Paper Website
Summary FrameNet is a linguistic project that catalogs the semantic frames of English words, focusing on the roles and relationships between words in sentences. It's based on Frame Semantics, a theory positing that the meaning of a word is best understood in the context of a larger conceptual structure, or "frame". FrameNet provides structured representations of word meanings, identifying the elements and participants involved in different scenarios. This resource is valuable for natural language processing applications like semantic role labeling and text understanding, offering insights into how language constructs meaning through interrelated words and their contextual roles.
1993 Penn Treebank Linguistic data consortium analysis Paper N/A
Summary The Penn Treebank project undertakes the ambitious task of building a large annotated corpus of American English, aiming to significantly progress in understanding both written text and spoken language. It involves annotating over 4.5 million words with part-of-speech and skeletal syntactic structure. The POS tagging utilizes a simplified tagset and combines automatic assignment with human correction, ensuring speed, consistency, and accuracy. Bracketing, following a similar process, employs the automatic parser Fidditch for initial analysis. The project's output, crucial for linguistic research and natural language processing, plans future enhancements for more detailed annotations and addressing complexities in linguistic structures.
1990 Parsons1990 Events in the Semantics of English: A Study in Subatomic Semantics Paper N/A
Summary Terence Parsons' 1990 book "Events in the Semantics of English" develops the Neo-Davidsonian approach to event semantics. It highlights the complexities of semantic representations, addressing how verbs and their associated roles in sentences form a logical structure. The book addresses issues like variable polyadicity, handling multiple roles, and missing roles (e.g., in dream scenarios). It distinguishes between core and non-core roles and utilizes explicit quantification of underlying events, advancing Davidson's 1967 theory by treating all roles as independent conjuncts. The work has influenced NLP resources and our understanding of language's logical structure.
1967 Davidson1967 The Logical Form of Action Sentences arXiv N/A
Summary In "The Logical Form of Action Sentences," Davidson delves into the intricate nature of sentences that describe actions, challenging traditional approaches to their analysis. The essay begins by examining the conventional treatment of action verbs in standard predicate logic, where Davidson finds significant shortcomings. He argues that such an approach fails to capture the complexity of action sentences, particularly in attributing a singular term to actions. Davidson's exploration leads him to question how we understand agency within these sentences. He differentiates between the agent – the doer of the action – and the action itself, proposing that actions can be viewed as events with various descriptions. This perspective allows for a more nuanced analysis of action sentences, considering the multifaceted ways in which actions are expressed and understood. Throughout the essay, Davidson critiques other philosophers' approaches to the logical form of action sentences. He specifically addresses the works of Kenny and Chisholm, pointing out the limitations in their analyses and suggesting that their methods fail to accommodate the complexity of action verbs and their modifiers. In response to these critiques, Davidson proposes a new approach. He suggests considering the inclusion of events as entities in the analysis of action sentences, thereby providing a richer and more accurate representation of their logical form. This perspective enables a better understanding of how intentional actions are expressed in language and their semantic implications. The essay culminates in a discussion on the broader grammatical and semantic consequences of this new approach to analyzing action sentences. Davidson's innovative perspective sheds light on the intricate nature of language and the way we understand actions and intentions within it, offering valuable insights into the philosophy of language.

Symbolic Reasoning AI major projects

Publication Year Name Description Paper Link GitHub Link Summary
2023 Cyc by Cycorp Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc arXiv 2023 Star
Summary 10-point summary:
1. Addresses limitations of LLMs in trustworthiness and reasoning.
2. Proposes integration with symbolic AI, using Cyc as an example.
3. Combines strengths of LLMs and symbolic AI.
4. Discusses 16 elements necessary for trustworthy AI.
5. Emphasizes need for explicit knowledge representation.
6. Highlights importance of reasoning, world models, higher-order logic.
7. Cyc's common-sense knowledge can enhance LLMs.
8. Suggests improved explanation, deduction, induction capabilities.
9. Aims to overcome LLMs' shortcomings in complex language.
10. Envisions more reliable, interpretable AI systems.

Detailed Summary:
The 2023 paper "Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc" by Doug Lenat and Gary Marcus discusses the limitations of current Large Language Models (LLMs) in trustworthiness and reasoning. It proposes integrating LLMs with symbolic AI systems like Cyc to address these limitations. The paper outlines 16 key elements necessary for a trustworthy AI, emphasizing the need for explicit knowledge, reasoning, world models, and higher-order logic capabilities. It suggests that integrating LLMs with systems like Cyc, which can handle complex reasoning and possess extensive common-sense knowledge, could lead to more reliable and interpretable AI systems. This approach aims to overcome the shortcomings of LLMs, particularly in areas like explanation, deduction, induction, and handling of complex language structures.
2023 ARC Abstraction and Reasoning Challenge arXiv GitHub, GitHub
Summary The paper discusses using GPT-4 for solving the ARC Challenge, emphasizing learning from limited samples. It contrasts this with traditional deep learning's reliance on extensive data. The method involves reverse engineering instructions from input-output pairs and applying these to test inputs. The research highlights the importance of self-supervised learning in understanding complex structures and demonstrates LLMs' capabilities in zero-shot and few-shot learning. It also explores chain-of-thought prompting for multi-step reasoning. The study underscores the role of human-like biases in learning and suggests improvements like multi-agent systems and memory usage for better performance, aiming to solve a majority of ARC tasks with GPT-4.
2023 AMR Bank Abstract Meaning Representation (Bank) Conference Website
Summary Abstract Meaning Representation (AMR) is a semantic representation language capturing the essence of English sentences to advance natural language understanding and generation. It offers a unified format for semantically annotating sentences, inspired by the success of syntactic treebanks. AMR's graph-based structure, emphasizing readability and simplicity, abstracts from syntax to focus on meaning. Utilizing tools like a power editor and the 'smatch' script, AMR ensures consistent and efficient annotation. With an expanding AMR bank and applications in diverse areas including machine translation, AMR is poised to significantly impact the field, driving future research and practical applications in natural language processing.
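As a quick illustration of the notation, the sketch below parses a small hand-written AMR, written in the style of the "boy wants to go" example from the AMR guidelines, with the penman Python library (assumed to be installed).

```python
import penman

# "The boy wants to go" in AMR notation: the boy is both the wanter (ARG0 of
# want-01) and the goer (ARG0 of go-01), expressed by re-using variable b.
amr = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-01
            :ARG0 b))
"""

graph = penman.decode(amr)
print(graph.top)        # root variable: 'w'
for triple in graph.triples:
    print(triple)       # e.g. ('w', ':instance', 'want-01'), ('w', ':ARG0', 'b'), ...
```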
2023 AMR (3.0) Abstract Meaning Representation (AMR) Annotation Release 3.0 Conference LDC Catalog
Summary Abstract Meaning Representation (AMR) Annotation Release 3.0, developed by Linguistic Data Consortium and partners, contains a sembank of 59,255 English sentences from various sources for advancing natural language processing. This release enhances quality, adds annotations, and includes multi-sentence annotations. AMR represents sentence meanings as graphs, abstracting from syntax to focus on semantic structure. The data, sourced from diverse programs and including new text types, is split into training, development, and test partitions. This release is part of ongoing efforts to enrich machine translation and other applications through improved semantic understanding.
2023 Logic & Info Towards a Unification of Logic and Information Theory arXiv 2023 N/A
Summary This work explores the efficient transmission of logical statements, innovatively connecting logic and information theory. It treats logical statements as equivalent if they lead to the same deductions, applying rate-distortion theory for communication. Using propositional logic modelled as polynomial equations in finite fields, it investigates various scenarios, including transmissions with and without pre-existing background statements. Key contributions include demonstrating the optimality of incremental communications and the surprising "less is more" theorems. The study also delves into the partition compression problem, providing theoretical bounds and proposing linear codes for practical implementation, achieving asymptotic optimality under certain conditions.
2023 MinePlnr Benchmark for long-horizon planning in Minecraft worlds arXiv GitHub
Summary MinePlanner introduces a benchmark for testing AI planning capabilities in large, complex Minecraft worlds. Comprising 45 tasks with escalating difficulty, it assesses planners' abilities to navigate environments dense with objects, many of which are irrelevant to task goals. The benchmark challenges current planners, including Fast Downward and ENHSP-20, revealing significant limitations in handling large-domain problems. MinePlanner highlights the need for advancements in planning algorithms to cope with real-world scales and complexity. This framework not only serves as a testbed for AI planning research but also aims to foster collaboration between the learning and planning sectors in AI.
2023 GenPlan23 Abstract world models for value-preserving planning Paper N/A
Summary Learning Abstract World Models for Value-preserving Planning by Rafael Rodriguez-Sanchez and George Konidaris explores the development of abstract Markov Decision Processes (MDPs) for advanced planning in reinforcement learning. The paper addresses the challenge of complex decision-making in general-purpose agents by proposing state and action abstractions within abstract MDPs. This approach ensures effective planning and decision-making while maintaining compatibility with specific tasks. By leveraging information maximization and deep learning methods, the paper demonstrates improved planning efficiency in various environments. The research contributes significantly to the field by blending theoretical and empirical insights for effective abstract model learning and planning.
2023 D3A Building long-term 3D semantic maps Paper GitHub
Summary Introduces an algorithm for building dynamic 3D semantic maps, crucial for household robots in unstructured environments.
2023 Plansformer tool Plansformer Tool: Demonstrating Generation of Symbolic Plans Using Transformers Paper N/A
Summary"Plansformer Tool: Demonstrating Generation of Symbolic Plans Using Transformers" introduces Plansformer, a novel AI tool leveraging transformer-based language models to generate symbolic plans. Fine-tuned on classical planning domains, Plansformer demonstrates effective plan generation, evaluated through metrics like ROUGE and BLEU, and validated for optimality. It surpasses other models in generating valid plans for simple domains but faces challenges in more complex scenarios. Offering a user-friendly web interface, Plansformer enhances planning efficiency in diverse fields. The research opens new pathways for utilizing large language models in symbolic domains, signifying a step forward in AI planning technologies.
2022 compositional generalisation Compositional generalization through abstract representations in human and artificial neural networks Paper N/A
Summary This study explores compositional generalization in humans and artificial neural networks (ANNs) through a highly compositional task. By examining human behavior and neural correlates using fMRI, it identifies behavioral patterns of compositional generalization. The study also incorporates pretraining paradigms in ANNs, embedding prior knowledge of the task to improve performance. Results indicate that pretraining induces abstract internal representations in ANNs, leading to enhanced generalization and learning efficiency. It also reveals a content-specific topography of abstract representations across the human cortex. This research provides empirical evidence for the role of abstract representations in compositional generalization, with implications for ANN design and training.
2022 Plansformer Plansformer: Generating Symbolic Plans using Transformers arXiv N/A
Summary Introduces Plansformer, an LLM fine-tuned on planning problems, demonstrating high performance in various planning domains.
2022 Kogito A Knowledge-Grounded Neural Conversational Model arXiv GitHub
Summary"kogito" by Mete Ismayilzada and Antoine Bosselut is an innovative toolkit designed to generate commonsense knowledge inferences from textual input. It integrates with natural language generation models, offering a customizable and extensible interface for inference generation. The toolkit includes a library of pre-trained models like GPT-2, GPT-3, and COMET, and features modules for head extraction, relation matching, and inference filtering. kogito allows users to define custom knowledge relations, enhancing its adaptability. Accompanied by extensive documentation, the tool aims to standardize and simplify the process of generating meaningful commonsense inferences, paving the way for more intelligent and context-aware AI systems.
2022 RoboCat A Category Theoretic Framework for Robotic Interoperability Using Goal-Oriented Programming IEEE N/A
Summary RoboCat, developed by Angeline Aguinaldo and colleagues, introduces a novel framework for robotic programming, leveraging category theory for enhanced interoperability and usability. It transforms the way robots are programmed by adopting a goal-oriented, declarative approach that utilizes high-level abstractions rather than focusing on low-level, vendor-specific programming. By defining hierarchical interfaces and using mathematical representations, RoboCat simplifies integrating new hardware and software into robotic systems. This framework not only eases the programming process but also ensures consistency across different robotic platforms, making it a significant step towards more adaptable and efficient robotic applications in various industries.
2022 LinkBERT Pretraining Language Models with Document Links arXiv 2022 Star
Summary"LinkBERT: Pretraining Language Models with Document Links" by Michihiro Yasunaga, Jure Leskovec, and Percy Liang proposes a novel pretraining approach for language models, utilizing document links to capture inter-document relationships. LinkBERT, by integrating these links, effectively enhances language understanding and reasoning capabilities, especially in tasks requiring multi-hop reasoning and document relation comprehension. Tested in both general and biomedical domains, LinkBERT demonstrates notable improvements over traditional models like BERT, particularly in complex tasks and few-shot learning scenarios. This approach sets new benchmarks in biomedical NLP tasks, showcasing the immense potential of document-link-based pretraining in language models.
2021 LCN Logical Credal Networks arXiv 2021 Star
Summary "Logical Credal Networks" by Haifeng Qian et al. presents a novel approach to probabilistic logic modeling, addressing the challenge of representing imprecise information in AI applications. LCNs allow for the use of arbitrary logic formulas with probability bounds, overcoming the limitations of traditional models in expressiveness and dependency representation. Featuring a unique Markov condition, LCNs can manage cyclic dependencies, making them versatile for real-world scenarios like Mastermind puzzles and credit card fraud detection. The experimental results demonstrate LCNs' superiority in aggregating diverse information sources, marking a significant advancement in probabilistic logic models and their application in uncertain environments.
2021 DeepStochLog Neural Stochastic Logic Programming arXiv 2021 Star
Summary "DeepStochLog: Neural Stochastic Logic Programming" introduces a novel framework that combines neural networks with stochastic logic programming, offering a scalable and expressive approach to neural-symbolic computation. DeepStochLog enhances traditional logic programming with neural networks, improving its capabilities in complex tasks such as MNIST Digit Addition and Handwritten Mathematical Expressions. The framework not only demonstrates state-of-the-art performance in various neural-symbolic tasks but also excels in scalability and efficiency. DeepStochLog's ability to encode complex relational problems beyond standard grammars opens new possibilities in AI, highlighting its significance in advancing neural-symbolic integration.
2021 Structure-aware-BART Structure-Aware Abstractive Conversation Summarization via Discourse and Action Graphs arXiv 2021 Star
Summary "Structure-Aware Abstractive Conversation Summarization via Discourse and Action Graphs" by Jiaao Chen and Diyi Yang introduces an innovative summarization model that integrates discourse relations and action graphs into conversation summaries. The model addresses the unstructured and complex nature of human conversations, improving the preciseness of summaries. By constructing discourse relation graphs and action graphs, the model effectively captures dependencies between utterances and associations between speakers and actions. Empirical tests show significant improvements over existing methods in both automated and human evaluations. This approach opens new avenues for accurate and informative conversation summarization in various domains.
2021 2P-Kt A logic-based ecosystem for symbolic AI ScienceDirect Star, Pika-Lab
Summary "2P-Kt: A logic-based ecosystem for symbolic AI" by Giovanni Ciatto et al., presents 2P-Kt, a significant evolution of the tuProlog project, aimed at creating an open, modular, and interoperable ecosystem for logic programming and symbolic AI. 2P-Kt leverages Kotlin multiplatform technology, offering a wide range of functionalities such as logic term manipulation, unification, and solver engines. It supports different logic programming paradigms and facilitates the integration of symbolic and sub-symbolic AI. The platform is designed to be extensible, encouraging further developments in AI research and applications. 2P-Kt's open-source nature and comprehensive toolkit make it a valuable asset for both researchers and practitioners in AI.
2020 SenseBERT Driving Some Sense into BERT arXiv 2020 Star
Summary SenseBERT, an extension of BERT, introduces a novel approach to lexical semantics by incorporating word sense information directly into the pre-training process. Using WordNet's supersense categories for weak-supervision, it enhances BERT's capability to understand and predict word meanings in context. This results in significantly improved performance on lexical tasks like Word Sense Disambiguation and the Word in Context (WiC) task, while maintaining competitive performance on standard benchmarks like GLUE. The model effectively manages word ambiguity and out-of-vocabulary words, showing potential for richer semantic learning in language models without relying on human-annotated data. SenseBERT represents a significant advancement in harnessing external linguistic knowledge sources for neural language models.
2019 CILP Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning arXiv 2019 N/A
Summary "Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning" outlines an approach that merges neural network-based learning with symbolic logic-based reasoning, addressing the interpretability and accountability issues in AI. It covers different strategies for representing symbolic knowledge in neural networks and discusses the integration of various logic types for robust reasoning and learning. The paper highlights the importance of explainable AI, focusing on knowledge extraction, natural language generation, and program synthesis. It presents neural-symbolic learning and reasoning as key to developing intelligent systems that are not only efficient and effective but also interpretable and accountable.
2019 BART (Meta) Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension arXiv 2019 Star
Summary "BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension" introduces BART, a denoising autoencoder that pre-trains sequence-to-sequence models by corrupting and reconstructing text. It employs a Transformer-based architecture, combining aspects of BERT and GPT. BART's flexible noising strategy includes token masking, deletion, and text infilling, making it versatile for various tasks. It excels in text generation, comprehension tasks, and machine translation, matching or surpassing benchmark performances like RoBERTa's on GLUE and SQuAD. Ablation studies within the BART framework indicate its consistency across tasks. BART's scalability in pre-training shows significant potential for wide applications in natural language processing.
2018 DeepProbLog Neural Probabilistic Logic Programming arXiv 2018 Star, DTAI
Summary DeepProbLog introduces a novel approach to neural probabilistic logic programming, effectively integrating deep learning with probabilistic reasoning. It extends the probabilistic logic programming language ProbLog with neural predicates, handling uncertainty from neural network outputs within a logical reasoning framework. The language supports both symbolic and subsymbolic reasoning, allowing end-to-end model training on complex tasks that require more than just standard learning. Through gradient descent-based learning, DeepProbLog jointly trains parameters in logic programs and neural networks. This integration leverages the strengths of both neural networks and logical reasoning, making DeepProbLog a powerful tool for advanced artificial intelligence applications.
2018 Sem-Loss Sym-Knowledge A Semantic Loss Function for Deep Learning with Symbolic Knowledge Conference Star
Summary "A Semantic Loss Function for Deep Learning with Symbolic Knowledge" introduces a novel approach to integrate deep learning with symbolic logic through a semantic loss function. This function evaluates the conformity of neural network outputs to logical constraints, enhancing learning in both semi-supervised and structured prediction tasks. The method demonstrates significant improvements in classification accuracy on datasets like MNIST and CIFAR-10. It is particularly effective in complex structured prediction, such as preference rankings or path predictions. The semantic loss function is developed axiomatically, ensuring logical soundness and differentiability, and can be incorporated into standard deep learning models as an additional regularization term.
2018 BERT (Google) Pre-training of Deep Bidirectional Transformers for Language Understanding arXiv 2018 Star
Summary BERT is a transformative language representation model that pre-trains deep bidirectional representations from unlabeled text by considering the context from both sides of a token in all layers. This novel approach enables it to achieve remarkable results in eleven NLP tasks, surpassing existing benchmarks significantly. BERT operates through two pre-training tasks: masked LM and next sentence prediction, facilitating effective language model pre-training. Its architecture is a multi-layer bidirectional Transformer encoder, capable of handling both single and paired text inputs. This versatility, combined with minimal architectural adjustments for various downstream tasks, makes BERT exceptionally powerful and adaptable for a wide range of NLP applications. A short masked-LM usage sketch appears after this table.
2017 AMR Parsing Transition-based Neural Parser Resource Star
Summary This paper introduces a novel, efficient approach to parsing Abstract Meaning Representation (AMR) through an incremental, transition-based parser. Adapting techniques from dependency parsing, it addresses unique AMR challenges like non-projectivity, reentrancy, and complex word-to-node alignments. The parser processes sentences left-to-right in linear time, enhancing efficiency. The proposed evaluation metrics assess specific sub-tasks, providing deeper insights into parser performance across various aspects, including named entity recognition and negation handling. The parser shows competitive results, particularly in recovering named entities, despite not achieving the highest overall Smatch score, showcasing the potential for real-time applications and studies in building sentence meaning incrementally.
2014 cplint A framework for reasoning with Probabilistic Logic Programming iris GitHub cplint
Summary "cplint on SWISH" is a cutting-edge web application enabling probabilistic logical inference directly in a web browser. It significantly simplifies user interaction by obviating the need for complex installations. The platform supports advanced features such as hybrid programs with both discrete and continuous variables, a unique offering in web-based probabilistic logic programming. It includes various inference algorithms like rejection sampling and Metropolis-Hasting for robust reasoning. Notably user-friendly, it provides a diverse array of examples across different probabilistic models, demonstrating the system's versatility and capability. Future enhancements promise to expand its functionalities, making it an increasingly powerful tool for probabilistic programming.
2007 Modal Connectionist modal logic: Representing modalities in neural networks ScienceDirect Star
Summary The paper "Connectionist Modal Logic: Representing Modalities in Neural Networks" introduces a novel framework that integrates neural networks and modal logic, referred to as Connectionist Modal Logic (CML). This approach is part of the broader domain of neural-symbolic integration, aiming to utilize symbolic knowledge within the neurocomputing paradigm. CML facilitates the representation and learning of propositional modal logic using neural networks. The framework is validated through the application to the Muddy Children Puzzle, demonstrating its potential in distributed knowledge representation. The paper suggests that CML offers a balance between expressive power and computational feasibility and hints at the possibility of extending beyond propositional logic to first-order logic.
2003 DAMLJessKB A Tool for Reasoning with the Semantic Web IEEE N/A
Summary"DAMLJessKB: A Tool for Reasoning with the Semantic Web" by Joseph Kopena and William C. Regli presents DAMLJessKB, an innovative tool designed for reasoning with Semantic Web technologies. The tool focuses on integrating Semantic Web standards like DAML for effective knowledge management and inference in domains such as engineering design. Using the Jess production system, DAMLJessKB interprets and processes RDF/DAML encoded data, enabling sophisticated reasoning capabilities, including class instance and terminological reasoning. The tool's practical applications, ease of integration, and potential enhancements to support OWL, make it a significant step towards realizing the Semantic Web vision.

Knowledge representation major projects

Publication Year Name Description Paper Link GitHub Link Summary
2023 ReckonMK Reckoning MetaKG: Hyper-Edge Prediction in the Meta Knowledge Graph arXiv GitHub
Summary RECKONING introduces an innovative approach to enhance the reasoning abilities of transformer-based language models. It encodes contextual knowledge directly into the model's parameters using a bi-level optimization process, consisting of an inner loop for rapid adaptation and an outer loop for optimizing initial weights. This method improves robustness against distractors and enhances reasoning performance. Extensive testing on datasets like ProofWriter, CLUTRR-SG, and FOLIO shows that RECKONING outperforms traditional in-context reasoning, particularly in handling irrelevant facts and generalizing to longer reasoning chains and real-world knowledge. Additionally, it proves more efficient in scenarios involving multiple questions due to its single-time knowledge encoding.
2023 PeaCok Persona Commonsense Knowledge for Consistent and Engaging Narratives arXiv GitHub
Summary PEACOK is a novel knowledge graph offering comprehensive persona commonsense knowledge to enhance narrative systems. Containing around 100K human-validated facts across five key dimensions, it was constructed using a combination of existing commonsense knowledge graphs and inputs from large-scale language models, refined through a human-AI majority voting process. This resource provides deep persona insights, fostering more consistent and engaging narratives in applications such as dialogue systems. While showcasing the potential of integrating machine-generated and human-validated content, PEACOK also acknowledges its linguistic and ethical limitations, primarily its focus on English data and reliance on existing knowledge sources.
2023 KOALA A Dialogue Model for Academic Research BAIR Blog (blog post only) GitHub
Summary Koala is a new dialogue model developed by fine-tuning Meta's LLaMA with web-gathered dialogue data, aiming to match the capabilities of larger closed-source models. Focusing on high-quality, diverse data including user-shared dialogues with ChatGPT, Koala demonstrates competitive performance, sometimes preferred over Stanford’s Alpaca and equaling ChatGPT in user studies. This suggests smaller, local models with carefully curated data can achieve results similar to larger ones, highlighting the importance of dataset quality over size. However, Koala, being a research prototype, has limitations in content, safety, and reliability and is recommended only for academic research. Koala's release includes an interactive demo, training framework, model weights, and a test set, intended for academic use under specific licenses. The project, a Berkeley AI Research Lab (BAIR) initiative, encourages community feedback to identify potential improvements and safety issues.
2023 Wolfram Alpha Wolfram Data Framework (WDF) Take data and make it meaningful Documentation GitHub
Summary Wolfram Data Framework (WDF) utilizes the Wolfram Language and Knowledgebase for a standardized computable description of real-world data. It includes functions for importing and interpreting data, and structures like Integer, Entity, Quantity, and CloudObject. WDF supports data representation in Lists, Associations, Datasets, and Graphs, with specific functions for managing custom entities and their permissions.
2023 AI companions Various approaches from various companies N/A Kindroid Nomi Character
Summary AI companions like Kindroid AI, Nomi AI, and Character AI represent the evolving landscape of personalized digital assistants. Kindroid AI offers a customizable AI companion that can engage in text chat, generate AI selfies, and speak with human-like voices. It acts as a personal assistant, handling reminders and schedules while ensuring privacy and security of chats. Nomi AI focuses on personalized conversations and emotional support, evolving through interactions to understand user preferences and styles, enhancing emotional bonds. It supports setting reminders and discussing a wide range of topics, underlining the role of AI in personal growth and mental health support. Character AI centers on user-created conversational characters with distinct personas. These platforms exemplify the trend towards more interactive, emotionally intelligent AI companions in daily life.
2023 SKIP Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference arXiv 2023 GitHub
Summary SkipDecode is an innovative token-level early exit method for large language models, enhancing efficiency in batch processing and KV caching. It overcomes limitations of existing methods by implementing unified exit points and a monotonically decreasing number of exits as sequences progress. This design optimizes batched inference and eliminates the need for recalculating KV caches. The method achieves substantial inference speedups (2x to 5x) with minimal performance loss. Tested on OPT models with extensive datasets, SkipDecode maintains a controlled computational budget and outperforms similar techniques. Future improvements may include alternative decay functions and policy extensions to prompts. An illustrative sketch of a decreasing exit-layer schedule appears after this table.
2023 TextWorldExpress Simulating Text Games at One Million Steps Per Second arXiv 2023 CognitiveAI Lab Microsoft
Summary TEXTWORLDEXPRESS is a high-performance simulator that significantly advances text-based game simulations for virtual agent research. It dramatically increases simulation throughput to over one million steps per second, enabling complex billion-step experiments in about a day. This tool enhances agent evaluation in tasks requiring language understanding, problem-solving, and reasoning. It outperforms existing simulators like TextWorld and Jericho in speed, offering rapid simulations on standard desktop hardware. Although it has some limitations, such as requiring SCALA for new environment additions, its contribution as an open-source tool represents a significant leap forward in embodied agent research and text-based game simulations.
2023 ParlAI Python framework for sharing, training and testing dialogue models, from open-domain chitchat N/A (Meta) GitHub, ParlAI
Summary ParlAI is a unified Python-based platform designed for dialogue AI research. It offers a one-stop shop for researchers with access to over 100 popular language datasets and a wide set of reference models, including pre-trained ones. Key features include seamless integration with Amazon Mechanical Turk and Facebook Messenger for data collection, training, and human evaluation. Originally open-sourced by Facebook in 2017, ParlAI supports various project domains like generative and retrieval models, interactive learning, and offensive language recognition. It allows training and evaluating dialogue models on diverse tasks and supports multi-modality for tasks involving both text and images. For more details, one can check ParlAI's GitHub repo, which includes installation guides, documentation, and examples.
2023 DWD overlay The DARPA Wikidata Overlay Paper GitHub
Summary The DARPA Wikidata Overlay (DWD Overlay) is an enriched subset of Wikidata, integrated with PropBank roles to enhance natural language processing ontologies. It addresses Wikidata's entity-centric focus by adding detailed event and action descriptions, including temporal relations based on Allen Interval Temporal Logic. The overlay is accessible in JSON format, facilitating easy integration into various applications, particularly for event extraction and narrative understanding. The project involves aligning Wikidata's entity-oriented structure with PropBank's event-focused roles, presenting unique challenges. Future work includes fully integrating the overlay into Wikidata and expanding its scope and accuracy.
2023 HITL-Schema Human-in-the-Loop Schema Induction arXiv 2023 GitHub
Summary The Human-in-the-Loop Schema Induction system innovatively combines the capabilities of GPT-3 with human input to create detailed and accurate event schemas, crucial for event-centric natural language understanding (NLU). This approach overcomes the limitations of fully manual or automated systems by ensuring scalability and quality. The system's process involves step generation, node extraction, graph construction, and node grounding. Its interactive interface simplifies user involvement in schema editing and generation. While acknowledging the cultural biases inherent in the data and models used, the system aims for inclusive representation in diverse domains, enhancing tasks like misinformation detection and question answering.
2023 RAMFIS Representation of Vectors and Abstract Meanings for Information Synthesis Paper N/A (U.S. Air Force)
Summary The RAMFIS project, part of the AIDA program, aimed at improving multimodal information processing by synthesizing diverse data sources to identify contradictions and confirmations. Initial focus was on mapping multi-modal vector representations to the LDC's AIDA Ontology, later shifting to text vector representations. Key tasks included entity and event linking across documents, development of a cross-program ontology transitioning from the LDC Ontology to Wikidata, and enhancing coreference resolution using embedding-based methods. The project also involved face embedding mappings, exploring transformer embeddings, and error analysis using tools like Brandeis Explorer. Contributions from various teams were integral to achieving RAMFIS's multifaceted goals.
2022 ScienceWorld ScienceWorld: Is your Agent Smarter than a 5th Grader? Paper GitHub
Summary ScienceWorld is an interactive text-based environment designed to test AI agents' scientific reasoning skills at an elementary level. It features 30 benchmark tasks across various science topics, challenging agents to apply knowledge in a dynamic, procedural context. The environment, resembling a house with multiple rooms and objects, supports diverse actions and simulates processes like thermodynamics and electricity. Empirical tests reveal that current language models struggle with these tasks, indicating the need for better integration of declarative and procedural knowledge in AI. ScienceWorld's unique approach has broad implications for AI research, emphasizing the development of more advanced, reasoning-capable models.
2022 D&D as a Dialogue Dungeons and Dragons as a Dialog Challenge for Artificial Intelligence Paper N/A
Summary The study introduces Dungeons and Dragons (D&D) as an advanced AI challenge, emphasizing dialogue systems and interactive storytelling. Utilizing a comprehensive gameplay dataset, a large language model was trained to generate game dialogues and predict game states, effectively role-playing as characters or the Dungeon Master. The model's output was evaluated through human assessments, focusing on the authenticity and appeal of its dialogue. This research underscores AI's potential in complex, narrative-driven environments and marks a significant step in AI’s application in interactive storytelling, presenting both challenges and opportunities for future developments in the field of AI-driven narrative generation and game design.
2022 QALD QALD-9-plus: A Multilingual Dataset for Question Answering over DBpedia and Wikidata Translated by Native Speakers Paper GitHub
Summary QALD-9-plus, an enhancement of the QALD-9 dataset, introduces high-quality translations of questions into 8 languages, including several low-resource ones, and transfers SPARQL queries from DBpedia to Wikidata. This extension, done through native speakers and crowdsourcing, addresses flaws like poor translations and ambiguous questions, increasing the dataset's usability and relevance. It enables more comprehensive evaluation of KGQA systems, maintaining QALD-JSON format for ease of use. The project aims to further expand language coverage and question quantity, contributing significantly to the multilingual accessibility of KGQA systems and setting a benchmark for future KGQA research. An illustrative Wikidata SPARQL query of the kind such systems must produce appears after this table.
2022 SAGA A Platform for Continuous Construction and Serving of Knowledge At Scale Paper Apple
Summary Saga is a next-generation platform by Apple for constructing and serving knowledge graphs at an industrial scale. It continuously integrates data to create a central knowledge graph, addressing challenges like data freshness, accuracy, and provenance management. The platform uses a hybrid batch-incremental design and a federated polystore approach for diverse workloads, including a specialized graph query engine. Key features include live graph components for real-time data integration and graph machine learning for entity recognition and duplicate detection. Saga supports applications such as open-domain question answering and semantic annotations, continuously expanding and refining its knowledge base through collaborative efforts.
2022 Foresee Knowledge-Based News Event Analysis and Forecasting Toolkit Paper N/A
Summary The toolkit presented is designed for knowledge-based analysis and forecasting of news events, powered by a Knowledge Graph (KG) developed from various sources. It retrieves ongoing news, identifies relevant events, and performs causal analysis to forecast potential outcomes. The toolkit maps news headlines to KG event types, uses causal knowledge extraction to enrich the graph, and includes event sequence analysis for comprehensive forecasting. Initially constructed from Wikidata, it focuses on significant societal events like disease outbreaks and natural disasters. The toolkit's interactive KG visualization aids in effective analysis, and a planned demonstration will showcase its capabilities using recent real-world events.
2021 Human Schema Human Schema Curation via Causal Association Rule Mining Paper Links
Summary Presents a semi-automatic system for curating a library of structured, machine-readable event schemas using Causal Association Rule Mining and a novel annotation interface, SchemaBlocks. This system produces 232 detailed event schemas, each defining a typical real-world scenario in terms of events, participants, and relationships. Combining automated script induction with human-driven annotation, the approach efficiently creates high-quality schemas. The schemas, rooted in the DARPA KAIROS Phase 1 ontology, are evaluated using a schema intrusion task and corpus coverage metric, demonstrating their coherence and broad applicability. The research contributes significantly to structured knowledge representation, releasing these resources for public use.
2021 Few Shot Comet Analyzing Commonsense Emergence in Few-shot Knowledge Models arXiv GitHub
Summary The study "Analyzing Commonsense Emergence in Few-shot Knowledge Models" examines the emergence of commonsense knowledge in language models (LMs) fine-tuned on knowledge graph (KG) tuples. It questions whether commonsense knowledge is inherent from pretraining or acquired during fine-tuning. Through few-shot training settings, the study finds that commonsense knowledge models adapt rapidly from limited examples, suggesting pretraining provides an encoded knowledge base, with fine-tuning forming an interface to this knowledge. The research identifies key shifts in the model's attention heads and less change in feed-forward networks, indicating an adaptation in processing knowledge rather than changing stored representations.
2021 UMR Uniform Meaning Representation Paper Web, Tutorials (LREC), Tutorials (EMNLP)
Summary Uniform Meaning Representation (UMR) integrates and extends Abstract Meaning Representation (AMR) for annotating text semantics, emphasizing cross-linguistic application, including morphologically complex and low-resource languages. UMR extends AMR's capabilities by adding document-level features such as coreference and temporal relations. It supports both logical and lexical inference for enhanced semantic interpretation. Designed for scalability and ease of annotation, UMR is positioned to be universally applicable, adaptable to linguistic diversity, and suitable for large-scale applications. Pilot experiments indicate its robustness and reliability. Future developments include creating comprehensive annotation guidelines and tools to facilitate application across a wide range of languages.
2021 LLM Knowledge from Analogy Combining Analogy with Language Models for Knowledge Extraction Paper GitHub
Summary The study introduces Analogical Knowledge Extraction (AKE), a novel method combining BERT Language Model with analogical reasoning for extracting structured knowledge from text. Aimed at expanding knowledge bases with type-leveled general knowledge, AKE uses semantic parsing to create query cases that link sentence semantics to knowledge base facts. It features an ontological scoring system combined with BERT-based fact classification to ensure accuracy and relevance of extracted facts. Evaluated on Simple English Wikipedia, AKE outperforms baselines like T5 and Relation Extraction models. Future directions include processing more data and exploring richer knowledge representations and query case properties for enhancing general-purpose AI systems.
2021 MAKG/MS Satori Microsoft Academic Knowledge Graph / Satori project Web Academic Webpage GitHub
Summary The Microsoft Academic Knowledge Graph (MAKG) is a large RDF dataset with over eight billion triples, offering comprehensive information on scientific publications and related entities like authors, institutions, journals, and fields of study. Originating from the Microsoft Academic Graph, it's licensed under the Open Data Attributions license. Key features include periodically updated RDF dumps, URI resolution within the Linked Open Data framework, a public SPARQL endpoint, HTML page descriptions via pubby, and entity embeddings for all 210M scientific papers. MAKG supports various applications, such as entity-centric exploration, data integration using RDF, and data analysis for knowledge discovery in the academic domain.
2020 Comet Atomic A Commonsense Knowledge Base for Modeling Atomic Events arXiv Paper GitHub
Summary (COMET-)ATOMIC2020, a new commonsense knowledge graph (CSKG), addresses the quality and coverage challenges in existing CSKGs by integrating symbolic commonsense knowledge with neural language models. It emphasizes that manually constructed CSKGs might not achieve the necessary coverage for all NLP scenarios and proposes a new evaluation framework focusing on the effectiveness of KGs in aiding language models to learn implicit knowledge representations. ATOMIC2020, containing unique general-purpose commonsense knowledge, outperforms other CSKGs in training knowledge models for new entities and events. Notably, a BART-based model trained on ATOMIC2020 surpasses GPT-3 in few-shot tasks, validated by human evaluation.
2020 TWC Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines Paper GitHub
Summary TextWorld Commonsense (TWC) is a novel environment for evaluating text-based Reinforcement Learning (RL) agents using commonsense knowledge. TWC challenges agents with tasks like house organization, requiring understanding of object properties and relationships, sourced from ConceptNet. Agents dynamically construct a commonsense graph, integrating this knowledge with game context for decision-making. Benchmarking against human performance indicates significant potential for agent improvement. TWC's games, varying in difficulty, test agents’ adaptability and efficiency. Empirical results confirm that commonsense-enhanced agents surpass text-only counterparts, positioning TWC as a valuable platform for advancing research in RL and commonsense reasoning in AI.
2020 KBQA (IBM) Leveraging Abstract Meaning Representation for Knowledge Base Question Answering Paper GitHub
Summary The paper presents Neuro-Symbolic Question Answering (NSQA), a modular KBQA system that leverages Abstract Meaning Representation (AMR) for understanding complex questions in natural language. NSQA uses AMR for parsing questions, graph transformation to create logical queries aligned with the knowledge base, and Logical Neural Networks (LNN) for reasoning over these queries. It successfully tackles challenges like n-ary argument mismatches and structural discrepancies between AMR and KB queries. NSQA achieves state-of-the-art performance on DBpedia-based datasets, demonstrating its ability to handle complex multi-hop questions and unusual expressions. This system reduces the need for end-to-end training datasets, making it a significant contribution to the field of KBQA.
2020 WCEP A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal Paper GitHub
Summary The WCEP dataset, leveraging the Wikipedia Current Events Portal, is a significant contribution to multi-document summarization (MDS). It comprises 10200 clusters of news articles, each with a concise, human-written summary. Unique in its scale, it expands each event's coverage by including additional relevant articles from the Common Crawl archive. This large-scale dataset is specifically designed for real-world applications like news aggregation and summarization. Through empirical evaluations using various state-of-the-art MDS methods, the dataset provides a robust platform for further research. Its realism and comprehensive approach distinguish it from existing MDS datasets, making it a valuable resource for advancing automatic summarization technologies.
2019 KAIROS Knowledge-directed Artificial Intelligence Reasoning Over Schemas N/A DARPA
Summary The Knowledge-directed Artificial Intelligence Reasoning Over Schemas (KAIROS) program, led by Dr. Wil Corvey at the Defense Advanced Research Projects Agency (DARPA), aims to develop AI systems that can rapidly comprehend world events, crucial for U.S. national security. It seeks to construct schema-based AI that overcomes the limitations of first-wave (rule-based, symbolic reasoning) and second-wave (machine learning) AI systems. KAIROS will create and apply schemas to identify, link, and sequence events in multimedia data, focusing on events that impact national security. The program involves two stages: learning schemas from big data and applying these schemas to multimedia/multilingual information to detect complex events of interest. A KAIROS Proposers Day was held in January 2019 in Arlington, VA.
2019 KG-TE Infusing Knowledge into the Textual Entailment Task Using Graph Convolutional Networks Paper GitHub
Summary The paper presents a KG-Augmented Entailment Model that infuses external knowledge from knowledge graphs into the textual entailment task using Graph Convolutional Networks. It introduces a method to generate contextual subgraphs from KGs, focusing on relevance and noise reduction. These subgraphs, encoded by GCNs, are combined with text-based models to enhance classification performance. The approach significantly improves entailment prediction accuracy, particularly on the challenging BreakingNLI dataset, demonstrating robustness and resilience. The model’s modularity allows compatibility with various text-based models and KGs, addressing challenges in effectively integrating external knowledge into NLI tasks and offering significant advancement over text-only models. A minimal graph-convolution layer sketch appears after this table.
2019 Comet COMET: Commonsense Transformers for Automatic Knowledge Graph Construction Paper GitHub
Summary COMET introduces an innovative approach to expand commonsense knowledge graphs by integrating pre-trained language models with generative techniques. It focuses on constructing new knowledge by generating nodes and edges in existing graphs like ATOMIC and ConceptNet. COMET adapts pre-trained generative transformer language models such as GPT to generate rich, natural language descriptions of commonsense knowledge. It achieves remarkable precision, approaching human-level performance, thus offering a promising solution to the challenge of scaling high-precision commonsense knowledge bases. This method has significant implications for AI research, suggesting a shift towards generative methods for automatic knowledge base completion over traditional extractive approaches.
2018 Atomic ATOMIC: An Atlas of Machine Commonsense for If-Then Reasoning Paper GitHub
Summary ATOMIC is a comprehensive atlas for machine commonsense reasoning, organized through 877,000 textual descriptions of inferential knowledge based on "if-then" relations. It introduces nine relation types, covering causes, effects, and stative aspects, related to events, mental states, and personas. Unlike traditional knowledge graphs that focus on taxonomic or encyclopedic knowledge, ATOMIC emphasizes inferential, event-based knowledge. The data, collected via crowdsourcing, contributes to an extensive coverage of everyday events. Training neural models on this dataset enhances their reasoning and inference abilities for unseen events. ATOMIC represents a significant step towards addressing the gap in commonsense reasoning in current AI systems. A toy example of the if-then triple structure appears after this table.
2018 Jericho A lightweight python-based interface connecting learning agents with interactive fiction games N/A GitHub
Summary Jericho is a Python-based interface designed to connect learning agents with interactive fiction games, predominantly running on Linux and requiring Python 3, Spacy, and standard build tools like gcc. It streamlines the installation process either via PyPi or directly from GitHub. Key functionalities include a Frotz Environment, Object Tree, and Game Dictionary, enhancing the interaction with text-based games. Jericho facilitates actions and observations through string-based inputs in a reinforcement learning framework, supports game state saving and loading, and provides tools for accessing valid actions and game walkthroughs. Catering to diverse agent models like RCDQN, CALM, Q*BERT, and KG-A2C, Jericho stands as a versatile platform for research and development in text-based adventure games using reinforcement learning, encouraging community contributions while adhering to ethical open-source practices. A short usage sketch appears after this table.
2018 ProPara Everything Happens for a Reason: Discovering the Purpose of Actions in Procedural Text Paper AllenAI
Summary "Everything Happens for a Reason" introduces XPAD, a model designed to enhance the comprehension of procedural text, such as scientific processes or recipes, by not only predicting actions but also explaining their dependencies. The model extends the ProPara dataset with a focus on identifying why certain actions are necessary before others, based on the world state created by previous actions. XPAD significantly outperforms previous systems in explaining action dependencies and maintains performance on original tasks. The approach involves integrating background knowledge for more plausible effect predictions, potentially applicable to a wide range of texts involving procedural actions and state changes.
2017 DARPA AIDA Active Interpretation of Disparate Alternatives Paper DARPA
Summary DARPA’s Active Interpretation of Disparate Alternatives (AIDA) program aims to develop a semantic engine to interpret complex events, situations, or trends from diverse, unstructured sources. Targeting revolutionary advancements in information interpretation, AIDA covers five technical areas: Semantic Mapping, Common Semantic Representation, Multiple Hypotheses, Integration, and Data. The program operates in three 18-month phases, emphasizing innovative research to create and aggregate structured information into a common semantic space. Managed through a scientific review process, AIDA evaluates proposals for their potential to offer substantial advancements, with the National Institute of Standards and Technology (NIST) handling program evaluations. AIDA offers multiple awards, including contracts and cooperative agreements, based on the quality of proposals.
2016 ConceptNet (5.5) An Open Multilingual Graph of General Knowledge Paper GitHub, ConceptNet
Summary ConceptNet 5.5 is a versatile semantic graph that connects words and phrases in natural language through labelled edges, built from diverse sources like Wiktionary and OpenCyc. It represents terms in a standardized form and integrates knowledge from external databases. Significantly, ConceptNet is employed in enhancing word embeddings, merging distributional semantics from sources like word2vec with its semantic network to form the ConceptNet Numberbatch. This hybrid semantic space has shown excellent performance in word-relatedness evaluations and solving SAT-style analogies, demonstrating a high correlation with human judgment. The project's resources, including code and data, are publicly available on GitHub. A short example of querying the public ConceptNet API appears after this table.
2014 WikiData A free collaborative knowledgebase Paper GitHub
Summary Wikidata serves as a multilingual knowledgebase, offering a unified source of structured data for Wikipedia and external applications. It allows public editing, leveraging a community-driven approach to manage and evolve its data schema. Using property-value pairs for structuring information, Wikidata encompasses diverse knowledge from primary sources, with ample reference details. It integrates with external databases, providing comprehensive data accessibility through web services in various formats. Its continuous evolution, driven by the community, fuels its application in numerous domains. All data in Wikidata is released under the Creative Commons CC0 license, ensuring free and widespread accessibility.
2012 Google KG The Google Knowledge Graph Website N/A
Summary The Google Knowledge Graph is a knowledge base integrated into Google's search engine. Launched in May 2012, it presents information in an infobox next to search results, offering instant answers. Initially focusing on entities like people, places, and businesses, the Knowledge Graph rapidly expanded, growing from 570 million entities in its first seven months to 70 billion facts by mid-2016, and 500 billion facts on 5 billion entities by May 2020. The graph sources information from various outlets, including Wikipedia and the CIA World Factbook, and also feeds into Google Assistant and Google Home voice queries. While enhancing Google searches, it has been criticized for lacking source citations and contributing to a decline in Wikipedia article readership. Moreover, it's been scrutinized for presenting biased or inaccurate information due to its automated information-gathering method. Despite these criticisms, the Knowledge Graph has become an integral part of Google's search infrastructure.
2010 PRISMATIC Inducing knowledge from a large scale lexicalized relation resource Paper N/A
Summary PRISMATIC, developed by IBM's Watson Research Lab, is an expansive lexicalized relation resource created from over 30 GB of text. It contains approximately 1 billion frames, each representing a semantic unit of relations in textual data. PRISMATIC is distinctive in its automatic creation and broad scope, focusing on detailed knowledge about predicates. The use of frame cuts allows for dissecting frames to induce knowledge patterns and make fine-grained type inferences. Its versatility lies in potential applications across various AI fields such as type inference, relation extraction, and textual entailment. PRISMATIC stands out for its automated approach to knowledge induction, offering substantial utility for future AI research.
2008 Chambers2008 Unsupervised Learning of Narrative Event Chains Paper GitHub
Summary Chambers and Jurafsky's paper introduces narrative event chains, a novel approach in NLP for extracting structured knowledge from texts. Unlike traditional hand-coded scripts, these chains are derived using unsupervised learning methods from raw text, focusing on events connected through a common protagonist. The process involves three steps: learning narrative relations between events, temporally ordering these events, and pruning them to form coherent narrative chains. The paper introduces the narrative cloze task as a new evaluation method for assessing event relatedness within these chains. A key innovation is the use of unsupervised distributional methods to establish narrative relations, emphasizing the role of protagonists in narrative coherence. The results demonstrate significant improvements over baseline methods in both narrative prediction and temporal coherence, indicating the potential of this approach in enhancing understanding of narrative structures in text.
2007 GEOquery A bridge between the Gene Expression Omnibus (GEO) and BioConductor Paper GitHub
Summary GEOquery serves as a vital bridge between the Gene Expression Omnibus (GEO) and BioConductor, offering an efficient tool for accessing and analyzing a vast array of gene expression experiments. Designed to integrate with R's statistical programming environment, it automates the parsing of GEO's data formats, easing the analysis process. With the creation of custom data structures, GEOquery simplifies data retrieval through a single command, enhancing the utility for bioinformatics research. Its ability to convert GEO data into various BioConductor structures widens its application in genomics data analysis. As part of the Bioconductor project, GEOquery stands out for its user-friendly design and broad accessibility.
1975 Schank1975 SCRIPTS, PLANS, AND KNOWLEDGE Paper N/A
Summary The paper by Schank and Abelson introduces 'scripts' as a theoretical framework in AI to understand common situations through predetermined, stereotypical sequences of actions. These scripts, composed of interlinked slots that influence each other, handle everyday scenarios and are distinct from 'plans' that deal with novel situations. The SAM program, which applies scripts for language understanding, demonstrates this concept. Plans, on the other hand, describe deliberate behavior and choices for achieving goals, crucial for making sense of language. The authors emphasize the role of 'forgetting' in effectively processing and remembering stories, suggesting that understanding involves filtering out unimportant details while retaining crucial information. This approach underlines the necessity of structured knowledge and 'forgetting heuristics' in AI for simulating human-like understanding and memory retention.
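For the SkipDecode entry above, a toy sketch of the core scheduling idea: the exit layer never increases with token position, so later tokens pass through fewer layers. The linear decay function and the hyper-parameters below are illustrative assumptions, not the paper's actual policy.

```python
def exit_layer(position, max_layers=24, min_exit=4, max_seq_len=512):
    """Monotonically non-increasing exit point as a function of token position.

    Illustration only: the real method chooses the decay schedule to fit a
    computational budget and keeps exit points unified across a batch.
    """
    frac = min(position / max_seq_len, 1.0)
    return max(min_exit, int(round(max_layers - frac * (max_layers - min_exit))))

for t in (0, 64, 128, 256, 511):
    print(f"token {t:>3}: exit after layer {exit_layer(t)}")
```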
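For the QALD-9-plus entry above, an illustrative query against the public Wikidata SPARQL endpoint, issued with the SPARQLWrapper package; the question ("Which books did Douglas Adams write?") is not taken from the dataset, and Q42 (Douglas Adams) / P50 (author) are standard Wikidata identifiers.

```python
# pip install sparqlwrapper
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="NeuroAI-Cognition-Hub-example/0.1")
sparql.setQuery("""
SELECT ?bookLabel WHERE {
  ?book wdt:P50 wd:Q42 .                                  # works authored by Douglas Adams
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["bookLabel"]["value"])
```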
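For the KG-augmented entailment entry above, a minimal Kipf & Welling-style graph-convolution layer in PyTorch showing how a subgraph's adjacency structure can be encoded into node representations; this is a generic sketch, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer: H' = ReLU(Â H W), Â = D^-1/2 (A + I) D^-1/2."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, adj, feats):
        a_hat = adj + torch.eye(adj.size(0))          # add self-loops
        deg = a_hat.sum(dim=1)
        d_inv_sqrt = torch.diag(deg.pow(-0.5))
        a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt      # symmetric normalization
        return torch.relu(a_norm @ self.linear(feats))

adj = torch.tensor([[0., 1., 0.],                     # a 3-node path graph
                    [1., 0., 1.],
                    [0., 1., 0.]])
feats = torch.randn(3, 8)                             # initial node embeddings
layer = GCNLayer(8, 16)
print(layer(adj, feats).shape)                        # torch.Size([3, 16])
```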
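For the ATOMIC entry above, a toy Python structure illustrating the if-then organisation (head event, relation type, tail inferences). Only three of the nine relation types are shown, and the events and tails are made up for illustration, not copied from the released data.

```python
# ATOMIC-style if-then knowledge: head event -> relation type -> tail inferences.
atomic_sample = {
    "PersonX pays PersonY a compliment": {
        "xIntent": ["to be nice", "to make PersonY feel good"],
        "xEffect": ["PersonX smiles"],
        "oReact":  ["PersonY feels flattered"],
    },
    "PersonX burns the toast": {
        "xEffect": ["PersonX opens a window"],
        "oReact":  ["others smell smoke"],
    },
}

def query(head, relation):
    """Return the stored tail inferences for an (event, relation) pair."""
    return atomic_sample.get(head, {}).get(relation, [])

print(query("PersonX pays PersonY a compliment", "oReact"))
```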
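For the Jericho entry above, a short usage sketch assuming the pip-installable `jericho` package and a locally available z-machine game file; the game path below is a placeholder, since Jericho ships the bindings but not the games.

```python
# pip install jericho   (Linux; a z-machine game file such as zork1.z5 is required)
from jericho import FrotzEnv

env = FrotzEnv("roms/zork1.z5")        # placeholder path to a game file
obs, info = env.reset()
print(obs)

print(env.get_valid_actions())          # candidate actions detected by Jericho

# Standard RL-style interaction loop with string actions.
obs, reward, done, info = env.step("open mailbox")
print(obs, reward, done)
```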
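For the ConceptNet 5.5 entry above, a short example of querying the public REST API at api.conceptnet.io; availability of the public endpoint is assumed, and the printed fields follow the documented edge structure.

```python
import requests

# Fetch edges about the English term "dog" from the public ConceptNet API.
response = requests.get("https://api.conceptnet.io/c/en/dog", timeout=10).json()
for edge in response["edges"][:5]:
    rel = edge["rel"]["label"]
    start = edge["start"]["label"]
    end = edge["end"]["label"]
    print(f"{start} --{rel}--> {end}  (weight {edge['weight']})")
```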

Cognitive Architectures and Generative Models

Publication Year Name Description Paper Link GitHub Link Summary
2024 LLMs in ACT-R/Soar Comparing LLMs for enhanced ACT-R and Soar Model Development. Paper N/A
Summary This study explores using ChatGPT-4 and Google Bard to develop models for ACT-R and Soar in cognitive simulation. It involves two tasks: creating an ACT-R based driving simulation and using Soar to identify students' dominant intelligence types. The approach iteratively refines prompts to emulate cognitive actions and operations. The study documents challenges in integrating LLMs, offering solutions to improve model development. A significant contribution is the framework for prompt patterns, enhancing LLM interaction with cognitive architectures. The research highlights the potential of LLMs as interactive interfaces in cognitive modeling, paving the way for future explorations in the field.
2024 LLMs for Cognitive Agents Exploiting Language Models as knowledge sources for cognitive agents. Paper N/A
Summary The study explores integrating Large Language Models (LLMs) with cognitive architectures to provide cognitive agents with a rich source of task knowledge. It addresses the challenge of scaling cognitive agents to complex tasks, which hinges on their ability to acquire new knowledge. LLMs offer vast potential knowledge but face issues of reliability and trustworthiness. The paper suggests three integration patterns: indirect extraction, direct extraction, and direct knowledge encoding. Focusing on direct extraction, it outlines a method for agents to interact with LLMs to extract relevant task knowledge. This includes identifying knowledge gaps, formulating queries, interpreting responses, and verifying the extracted knowledge's applicability and accuracy.
2024 LLM Cog Companion Using Large Language Models in the Companion Cognitive Architecture: A Case Study and Future Prospects Paper N/A
Summary The paper explores integrating large language models (LLMs) like BERT into the Companion cognitive architecture to enhance its natural language capabilities. The Companion architecture, which features a rule-based semantic parser CNLU, aims to create human-like software social organisms capable of learning, reasoning, and interacting through natural language. BERT is used for improving CNLU's disambiguation capabilities and augmenting knowledge extraction in reading tasks. The research also discusses using LLMs for low-resource disambiguation, simplifying text inputs for better processing, and directly generating predicate calculus from text. These integrations present unique challenges and opportunities, emphasizing the potential of combining LLMs with knowledge-rich cognitive architectures.
2024 Chunks in Cognitive Arch Method for generating knowledge chunks in cognitive architectures. Paper GitHub
Summary This study presents a solution to the knowledge engineering bottleneck in cognitive architectures like ACT-R. It proposes using artificial intelligence methods, specifically natural language processing, to extract key entities, relationships, and attributes from unstructured text. These elements are then structured into triples or chunks, which can be used to augment the knowledge base of cognitive models. The primary application is in enhancing analogical reasoning within cognitive architectures. By automating the generation of knowledge chunks, this approach aims to reduce the time-intensive and expert-dependent process of knowledge engineering, enhancing the capabilities and efficiency of cognitive models in tasks like analogical reasoning.
2024 Generative Models in CA On Using Generative Models in a Cognitive Architecture for Embodied Agents. Paper N/A
Summary The paper discusses the integration of generative AI models, particularly Large Language Models like ChatGPT, into cognitive architectures, using the ICARUS framework as a reference point. It explores how generative models can enhance the capabilities of embodied agents in various cognitive tasks. The paper proposes using language as an intermediate medium for tasks beyond communication, including arithmetic, logical reasoning, and task planning. It highlights the potential use of these models in perception, inference, goal reasoning, and planning within the ICARUS architecture. Additionally, it addresses the challenge of LLMs' non-persistent learning by integrating them into a cognitive framework with long-term memory structures.
2024 CoA-Reasoning Efficient Tool Use with Chain-of-Abstraction Reasoning Paper N/A
Summary The paper introduces Chain-of-Abstraction Reasoning (CoA) as a method to enhance large language models (LLMs) for efficient multi-step reasoning. CoA involves fine-tuning LLMs to create reasoning chains with abstract placeholders, which are later filled with specific knowledge via domain tools. This method improves LLMs' reasoning robustness and reduces reliance on explicit domain knowledge, allowing for parallel decoding and tool usage. Tested on mathematical reasoning and Wikipedia QA tasks, CoA demonstrates superior QA accuracy and efficient tool usage compared to existing methods. Its versatility indicates potential applicability across various reasoning domains and LLM decoding strategies.
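A toy sketch of the chain-of-abstraction idea follows; the chain format, placeholder names, and the tiny calculator "tool" are all invented for illustration and are not the paper's prompts or tooling.

```python
# Sketch of the chain-of-abstraction idea: the model emits an abstract chain
# with placeholders, and a domain tool (here, plain arithmetic) fills them in.
# The chain format and placeholder syntax are invented for illustration.
import re

# Abstract chain a model might produce for "Ann has 3 bags of 7 apples and eats 2":
chain = ["y1 = 3 * 7", "y2 = y1 - 2"]

def run_calculator(expression: str, bindings: dict[str, float]) -> float:
    """Very small 'tool': substitute known placeholders, then evaluate a*b, a+b, or a-b."""
    for name, value in bindings.items():
        expression = expression.replace(name, str(value))
    a, op, b = re.fullmatch(r"\s*(\S+)\s*([*+-])\s*(\S+)\s*", expression).groups()
    a, b = float(a), float(b)
    return {"*": a * b, "+": a + b, "-": a - b}[op]

bindings: dict[str, float] = {}
for step in chain:
    target, expr = step.split("=", 1)
    bindings[target.strip()] = run_calculator(expr, bindings)

print(bindings["y2"])   # 19.0
```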
2024 VSA Bridging Cognitive Architectures and Generative Models with Vector Symbolic Algebras. Paper N/A
Summary The paper focuses on Vector Symbolic Algebras (VSAs) as a means to unify cognitive architectures with generative models. It presents VSAs as a comprehensive framework that can integrate symbolic and non-symbolic data, which is essential for cognitive models and neural networks. The paper explores the role of VSAs in creating kernel functions for data representation and views VSA-based memories as probabilistic distribution models. The study highlights the potential of VSAs in offering a unified approach to understanding cognition, brain activity, and uncertainty representations, but also acknowledges the challenges in resource management and the accuracy of high-dimensional vector representations. Future research directions include improving representation design and sampling techniques in VSAs.
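For readers unfamiliar with VSAs, the sketch below shows one common flavour, Holographic Reduced Representations, in which binding is circular convolution and bundling is addition. It is a generic demonstration that assumes numpy, not the specific algebra used in the paper.

```python
# Tiny demo of one VSA flavour (Holographic Reduced Representations): bind
# role/filler pairs with circular convolution, bundle by addition, and recover
# a filler by unbinding with the role's approximate inverse. Requires numpy.
import numpy as np

rng = np.random.default_rng(0)
D = 4096                                   # high-dimensional vectors

def rand_vec() -> np.ndarray:
    return rng.normal(0, 1 / np.sqrt(D), D)

def bind(a, b):                            # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=D)

def inverse(a):                            # approximate inverse (involution)
    return np.concatenate(([a[0]], a[1:][::-1]))

color, shape, red, square = (rand_vec() for _ in range(4))
scene = bind(color, red) + bind(shape, square)     # bundle two bound pairs

# Query: "what is the color?" -> unbind with color's inverse, compare to fillers.
noisy = bind(scene, inverse(color))
print(np.dot(noisy, red) > np.dot(noisy, square))  # True (red is the best match)
```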
2024 ML & CS Building Intelligent Systems by Combining Machine Learning and Automated Commonsense Reasoning. Paper GitHub
Summary The paper introduces a novel approach to building intelligent systems that closely emulate human cognitive processes by combining machine learning, particularly generative AI systems, for knowledge extraction from various sources with automated commonsense reasoning. The extracted knowledge is translated into predefined predicates and then reasoned over using the s(CASP) system, paralleling Kahneman's System 1 and System 2 modes of human thinking. This method has been applied to create systems for tasks like visual question answering, interactive chatbots, and autonomous driving, which not only respond interactively but also provide consistent, explainable outcomes. This approach ensures more reliable, human-like decision-making in AI systems, with applications demonstrated through a concierge bot example.
2024 LLM & CA Augmenting Cognitive Architectures with Large Language Models. Paper N/A
Summary This paper discusses augmenting cognitive architectures like Soar and Sigma with generative AI models, particularly Large Language Models (LLMs). It proposes integrating LLMs as a part of the cognitive architecture's declarative memory, which can be prompted to extract knowledge relevant to specific tasks. The integration approach leverages the strengths of both cognitive architectures and LLMs, aiming to create an agent that surpasses the capabilities of either approach used in isolation. The method involves using impasse-driven architecture to prompt LLMs and learn task-specific operator embeddings, thereby enhancing cognitive systems with the vast knowledge and processing abilities of LLMs. Future work includes evaluating this approach in interactive task learning and extending the Sigma architecture to incorporate these new functionalities.
2024 LLM-CA A Proposal for a Language Model Based Cognitive Architecture. Paper N/A
Summary The proposal outlines a Language Model Cognitive Architecture (LMCA) that augments Large Language Models (LLMs) by integrating them with cognitive architecture elements to address limitations in multi-step reasoning and compositionality. Drawing from dual-processing theory, the LMCA seeks to imbue LLMs with "slow thinking" abilities, characteristic of human cognitive effort and symbolic reasoning. Key components of the LMCA include working memory with specialized buffers, a retrieval module, and distinct long-term memory modules for memory, thought, and action processing. Training the LMCA presents challenges in data acquisition and necessitates mechanisms for continual learning and internal error generation. The proposal calls for software implementation and testing in real-world scenarios.
2024 CA & Trans Proposal for Cognitive Architecture and Transformer Integration. Paper GitHub
Summary This proposal explores integrating Transformers with a cognitive architecture like Soar for real-time, online learning from an agent's experiences. Traditional Transformers, such as LLMs, are trained offline and lack essential cognitive agent capabilities, including decision-making and reasoning. The integration aims to enable the Transformer to learn from the agent’s experiences and predict future events, thoughts, or actions. Key challenges include adapting Transformers for online learning, ensuring learned knowledge is grounded in the agent's experiences, and maintaining real-time responsiveness. Successful integration could endow agents with enhanced cognitive abilities like anomaly detection, action and world modeling, and anticipatory capabilities for complex environments.
2024 Minds&Mach Combining Minds and Machines: Fusion of Cognitive Architectures and Generative Models. Paper N/A
Summary This paper investigates the fusion of cognitive architectures (CAs) and generative models to create general embodied intelligence. CAs, focusing on human-like cognitive processes, provide structured decision-making frameworks, whereas generative models excel in creative content generation. Despite their strengths, CAs face challenges in modeling complex cognitive phenomena, and generative models often struggle with coherence and contextual relevance. The integration of these approaches aims to leverage the interpretability and reasoning of CAs with the creative capabilities of generative models, enhancing the overall capabilities of intelligent agents. The paper explores various integration strategies, including hybrid models, and provides a case study in a nuclear power plant, demonstrating enhanced cognitive robots and digital humans.
2024 EmbodGen Growing an Embodied Generative Cognitive Agent. Paper GitHub
Summary This paper presents an approach to developing an embodied generative cognitive agent by integrating large language models (LLMs) with embodied cognitive architectures. Drawing from an evolutionary perspective, it posits that all goals, including those of cognitive agents, are fundamentally physiological. The proposed model emphasizes that object properties and categories are not intrinsic but constructed in relation to an agent’s goals. The paper argues that current LLMs, while generative at a behavioral level, require deeper cognitive integration. It suggests a model where perception and concept formation are goal-driven, advocating for a predictive coding approach across different levels of cognition in the agent’s architecture.
2024 The Grounding Problem The Grounding Problem: Integration of Cognitive and Generative Models. Paper N/A
Summary This paper discusses the integration of cognitive and neural AI paradigms, focusing on the grounding problem as the key challenge. Grounding involves how AI systems develop meaningful semantics from representations without direct interaction with the world. The paper identifies five types of grounding (sensorimotor, communicative, epistemic, relational, and referential) essential for AI systems. By addressing the grounding problem, the authors propose that integrating cognitive models with connectionist approaches can overcome limitations of current AI systems. This integration, which encompasses social and ethical dimensions, is exemplified in computational creativity and educational applications. The approach highlights the importance of aligning AI systems with human values and societal expectations.
2024 GenInstLearn Generative Environment-Representation Instance-Based Learning: A Cognitive Model. Paper GitHub
Summary The Generative Environment-Representation Instance-Based Learning (GERIBL) model integrates generative models (GMs) with Instance-Based Learning Theory (IBLT) to enhance learning in dynamic decision-making tasks. GERIBL uses Artificial Neural Networks, specifically generative models like AutoEncoders and Generative Adversarial Networks, to automatically generate representations of complex tasks. This integration provides a new approach to forming task-relevant environment features and similarity metrics. The model was evaluated through experiments on visual utility learning and transfer of learning, demonstrating its ability to emulate human-like performance. Smaller representation sizes in GMs generally yielded better results, highlighting the model's potential in cognitive architecture and decision-making research.
2024 LLM-IBL Exploring Instructions to Rewards with LLMs in Instance-Based Learning. Paper N/A
Summary This study proposes using Large Language Models (LLMs) to enhance instance-based learning by incorporating descriptive information into experiential learning. The approach involves prompting LLMs with task instructions to define critical actions and assign values to these actions for successful task completion. In an initial experiment involving a grid-world task, this method significantly improved the learning of a cognitive model. The study addresses the challenge of temporal credit assignment by using LLM-interpreted instructions to provide dense reward signals, guiding the learning process. This approach demonstrates the potential of LLMs to enrich cognitive models with descriptive information, facilitating more efficient and effective learning.
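A minimal sketch of the reward-shaping idea follows; the subgoal dictionary stands in for whatever critical actions an LLM might extract from the task instructions, and the values and grid coordinates are invented.

```python
# Sketch of using instruction-derived subgoals as dense rewards in a grid world.
# Here the "LLM interpretation" is replaced by a hand-written dict mapping
# critical cells to bonus rewards, purely to illustrate the reward-shaping idea.

llm_subgoal_rewards = {(1, 1): 0.2, (2, 3): 0.3}   # hypothetical LLM output
goal, goal_reward, step_cost = (4, 4), 1.0, -0.01

def shaped_reward(state: tuple[int, int]) -> float:
    if state == goal:
        return goal_reward
    return step_cost + llm_subgoal_rewards.get(state, 0.0)

trajectory = [(0, 0), (1, 1), (2, 3), (4, 4)]
print(sum(shaped_reward(s) for s in trajectory[1:]))   # ~1.48
```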
2024 Psycho-Gen-Agents Psychologically-Valid Generative Agents in Agent-Based Modeling. Paper N/A
Summary The paper introduces Psychologically-Valid Generative Agents (PVGAs), a new framework for agent-based modelling in social sciences. It combines cognitive architectures, large language models (LLMs), and stance detection to simulate realistic human behaviours. This approach enables agents to make data-driven, cognitively-constrained decisions and generate human-like linguistic data. The framework builds on prior work with Psychologically Valid Agents (PVAs) within the ACT-R architecture, particularly in modelling human behaviour during the COVID-19 pandemic. By integrating agent-based modelling with generative AI and stance detection, PVGAs offer a robust method for understanding complex social phenomena and individual behaviours, enhancing research in public health, social sciences, and other domains.
2024 CG-AI Cognitive Architecture for Common Ground Sharing in Model-Model Interaction. Paper N/A
Summary This study focuses on developing a model to understand and improve common grounding between humans and generative AIs. It utilizes the Tangram Naming Task (TNT) as a testbed for examining the process of building a common cognitive framework essential for effective communication. The methodology includes generative AI models that simulate the internal processes of communication, where one model generates descriptions of abstract figures and another interprets these descriptions. Preliminary results show task performance exceeding chance levels, suggesting the successful implementation of a common cognitive framework. The study aims to refine these models further, paving the way for enhanced human-AI interaction and communication in the future.
2024 Neuro-Cog High-Level Machine Reasoning with Cognitive Neuro-Symbolic Systems. Paper N/A
Summary This paper proposes a method to enhance AI systems' reasoning capabilities by integrating cognitive architectures with neuro-symbolic components, focusing on high-level reasoning akin to human common sense. Addressing the limitations in large language models (LLMs) and autonomous systems, the authors suggest a hybrid framework centered on the ACT-R cognitive architecture. This integration aims to bring structured knowledge and advanced reasoning to AI systems. The paper also discusses the evolving role of generative AI in cognitive systems and suggests future directions, including scaling cognitive models using LLMs and improving LLMs through cognitive model-based prompt engineering, offering a path toward more sophisticated AI reasoning capabilities.
2024 Auto-Know Automating Knowledge Acquisition with LLMs for Cognitive Agents. Paper GitHub
Summary This paper describes an experiment using large language models (LLMs) to automate the learning of new entries in a cognitive agent's semantic lexicon. The approach is part of content-centric computational cognitive modeling, which relies heavily on extensive knowledge resources. The experiment aims to expand the semantic lexicon by learning expressions equivalent to transitive verbs. It employs a five-step process utilizing LLMs, including generating synonymous multiword expressions (MWEs). An innovative prompting architecture and a chain-of-thought approach guide the LLMs to produce relevant outputs. The experiment's success demonstrates the potential for integrating LLMs in automated knowledge acquisition for cognitive agents.
2024 SecurSoar Proposed Uses of Generative AI in a Cybersecurity-Focused Soar Agent. Paper GitHub
Summary Argonne National Laboratory's project on autonomous intelligent cybersecurity agents (AICAs) aims to counter advanced cybersecurity threats using cognitive architecture, specifically Soar. The project addresses the rising use of AI in cybersecurity, both defensively and maliciously. However, the current use of Soar is limited in handling novel situations due to a lack of modern AI principles for dynamic analysis. The proposed solution involves integrating generative AI with Soar to enhance its capabilities in contextual understanding and learning. The architecture focuses on protecting critical infrastructures and employs a centralized system for threat sharing and decision-making, leveraging both symbolic and generative AI to process diverse cybersecurity data efficiently.
2024 LeverageConf Leveraging Conflict to Bridge Cognitive Reasoning and Generative Algorithms. Paper N/A
Summary This position paper proposes a novel framework to address the challenges autonomous agents face in non-stationary environments. It integrates cognitive reasoning with generative algorithms, leveraging metacognitive conflict resolution to adapt to unexpected dynamics. The framework builds on the Common Model of Cognition, focusing on conflict detection as a trigger for agents to refine their cognitive strategies and update policies. It incorporates metalevel control, addressing resource constraints and utilizing generative models for prediction and perception. The aim is to seamlessly bridge low-level perceptual processes with higher-order cognitive functions, enabling agents to operate effectively in dynamic and unpredictable environments.
2024 Synerg-LLM Integrating LLMs and Cognitive Architectures for Robust AI. Paper GitHub
Summary This paper investigates the synergistic integration of Large Language Models (LLMs) and Cognitive Architectures (CAs) to enhance robust AI systems. It presents three distinct integration approaches: Modular, Agency, and Neuro-Symbolic. The Modular approach varies the degree of integration, using augmented LLMs and chain-of-thought prompting. The Agency approach, influenced by Society of Mind theory, involves micro and macro levels of cognitive interaction, while the Neuro-Symbolic approach, inspired by the CLARION architecture, combines bottom-up and top-down learning. This exploration aims to harness the complementary strengths of LLMs and CAs, addressing each field's limitations to advance AI development.
2023 CoALA Cognitive Architectures for Language Agents Paper GitHub
Summary The paper discusses the integration of large language models (LLMs) and production systems in developing language agents, proposing the Cognitive Architectures for Language Agents (CoALA) framework. CoALA organizes language agents with modular memory components, structured action spaces, and decision-making processes, drawing inspiration from the historical evolution of cognitive science and artificial intelligence. The framework views LLMs as probabilistic production systems, using prompt engineering as a control mechanism. It aims to provide a structured approach to creating more sophisticated language agents capable of reasoning, planning, and memory management. This approach seeks to bridge traditional AI methodologies with modern generative models, paving the way for language-based general intelligence.
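The skeleton below illustrates roughly what a CoALA-style agent loop might look like, with the memory module names taken from the summary above; `call_llm`, the prompt format, and the action table are placeholders, not the framework's actual API.

```python
# Skeleton of a CoALA-style language agent: modular memories, a structured
# action space, and a decision cycle that treats the LLM as a probabilistic
# production system. Module names follow the summary above; `call_llm` is a stub.

def call_llm(prompt: str) -> str:
    return "action: say_hello"            # placeholder for a real model call

class LanguageAgent:
    def __init__(self):
        self.working_memory: list[str] = []
        self.episodic_memory: list[str] = []
        self.semantic_memory: dict[str, str] = {"greeting": "Hello!"}
        self.actions = {"say_hello": lambda: self.semantic_memory["greeting"]}

    def decision_cycle(self, observation: str) -> str:
        self.working_memory.append(observation)
        prompt = "Context: " + " | ".join(self.working_memory) + "\nChoose an action."
        choice = call_llm(prompt).removeprefix("action: ")
        result = self.actions[choice]()                # structured action space
        self.episodic_memory.append(f"{observation} -> {choice}")
        return result

agent = LanguageAgent()
print(agent.decision_cycle("A user enters the room."))   # Hello!
```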
2023 Symbolic LLM's Synergistic Integration of Large Language Models and Cognitive Architectures for Robust AI: An Exploratory Analysis Paper N/A
Summary This paper explores the integration of Large Language Models (LLMs) and Cognitive Architectures (CAs) for developing more robust AI systems. It presents three integration approaches: Modular, Agency, and Neuro-Symbolic. The Modular approach varies integration degrees, employing chain-of-thought prompting and augmented LLMs. The Agency approach, inspired by the Society of Mind theory, involves interactions of agents at micro and macro levels. The Neuro-Symbolic approach, based on the CLARION architecture, combines bottom-up and top-down learning processes. These integrations aim to leverage the strengths of both LLMs and CAs while mitigating their individual limitations, proposing novel architectures for AI advancement.
2022 ACT-R VS Soar An Analysis and Comparison of ACT-R and Soar Paper N/A
Summary This paper provides a detailed analysis and comparison of two cognitive architectures: ACT-R and Soar. It delves into their overall structures, representations of agent data and metadata, and associated processing, focusing on working memory, procedural memory, and long-term declarative memory. The paper highlights the commonalities and differences between these two architectures, which are shaped by their primary goals: cognitive modeling for ACT-R and development of general AI agents for Soar. It identifies the processes and distinct classes of information used by these architectures, exploring the roles of metadata in decision making, memory retrievals, and learning. The analysis contributes to a deeper understanding of these cognitive models and their potential applications.
2022 ROSIE (from SOAR) Rosie (RObotic Soar Instructable Entity) is an agent written in the Soar Cognitive Architecture Website GitHub
Summary Rosie, the Robotic Soar Instructable Entity, is a project at the University of Michigan Soar Lab. It's an agent written in the Soar Cognitive Architecture that learns tasks through situated interactive instruction, showcasing capabilities in the area of Interactive Task Learning (ITL). Rosie stands out for its ability to learn entirely new tasks and concepts from just one example and apply this knowledge in varied scenarios. Developed by a team led by Professor John E. Laird, the agent can detect gaps in its knowledge and actively seek information or initiate interactions to fill these gaps. Rosie has been employed to learn games, puzzles, household tasks, and more, demonstrating its adaptability in real-time and interactive learning environments.
2018-22 SOAR Soar Cognitive Architecture Website GitHub
Summary Soar 9.6.2, the latest version of a cognitive architecture for intelligent behavior development, is now available for download. It includes enhancements to the debugger, CLI, and VisualSoar, along with bug fixes. The 43rd Soar Workshop will be hosted by the University of Michigan in 2023, featuring beginner and advanced tutorials. Recent achievements include Mininger and Laird winning Best Demonstration at the 2022 AAAI Conference. VISCA-2021 was hosted by the same group, focusing on various aspects of cognitive architecture. Significant contributors, Laird and Rosenbloom, received the 2018 Herbert A. Simon Prize for their work on Soar. The Soar platform is also utilized in unique installations like LuminAI and autonomous underwater vehicle IVER. Recent publications highlight Soar's diverse applications in AI and cognitive science. Soar, evolving since 1983, aims to address a wide range of intelligent agent tasks and incorporates multiple forms of knowledge and problem-solving methods. It seeks to approximate complete rationality by combining relevant knowledge for decision-making and is moving towards multiple learning mechanisms and representations of long-term knowledge for better functionality and performance.
1989 Structure Mapping Engine A computational model for analogical reasoning based on psychological theory. Paper N/A
Summary Overview of SME's role in problem-solving and concept comprehension through analogy. The Structure-Mapping Engine (SME) is a program that facilitates the study of analogical processing, based on the structure-mapping theory of analogy. It is designed to be flexible and efficient, making it useful in both cognitive simulation studies and machine learning. SME operates by constructing consistent interpretations of potential analogies, following a three-stage process that includes access, mapping and inference, and evaluation. The program adheres to constraints of structural consistency and one-to-one mapping, supported by empirical psychological evidence. SME's algorithm involves several steps, including match construction and evaluation, and has been applied in various domains including psychological studies and concept learning.

Common Model of Cognition

Publication Year Name Description Paper Link GitHub Link Summary
2024 Gen-CMC Opportunities and Challenges in Applying Generative Methods to Exploring and Validating the Common Model of Cognition. Paper N/A
Summary This paper focuses on applying generative methods like Dynamic Causal Modeling (DCM) to explore and validate the Common Model of Cognition (CMC). As CMC expands, its complexity increases, presenting challenges in validating and comparing intricate network structures. Alternative methods such as regression DCM (rDCM) and Biophysical Network Modeling (BNM) are considered for handling these complexities. However, a significant challenge remains in reliably comparing models of varying complexities. Tools like sparse rDCM may assist in identifying plausible connections within complex networks, but there's still a need to develop robust comparison methods to evaluate and validate these expanded cognitive models effectively.
2024 COG-GEN Neuro-Mimetic Realization of the Common Model of Cognition via Hebbian Learning and Free Energy Minimization. Paper GitHub
Summary Introduces COGnitive Neural GENerative system for representing the Common Model of Cognition using Hebbian learning and free energy principle. CogNGen is a pioneering cognitive architecture that integrates the Common Model of Cognition with contemporary neural generative models. It employs neurobiologically plausible elements, such as neural generative coding and vector-symbolic memory models, adhering to predictive processing principles. The architecture optimizes a variational free energy functional, balancing model complexity and accuracy. It features sophisticated memory systems, including working and long-term memory modeled on MINERVA 2 and Hopfield networks. CogNGen's effectiveness in complex maze-solving tasks demonstrates its potential. Future enhancements aim to scale up the architecture for more challenging tasks, refine memory systems, and advance perceptual and motor modules.
2024 Cog-FM Integrating Cognitive Architectures with Foundation Models. Paper N/A
Summary Discusses integrating cognitive architectures with foundation models for few-shot learning. The paper proposes an integration of cognitive architectures and foundation models to support trusted artificial intelligence. It aims to use cognitive architectures for their strengths in human-like few-shot learning and managing the vast data processed by foundation models, which often suffer from inaccuracies and hallucinations. Trust in AI is a central focus, emphasizing the difference between AI trustworthiness and human trust. The potential synergy between these systems could be realized by using foundation models as knowledge repositories for cognitive architectures, thereby addressing scalability and memory issues. Future research is directed towards developing neuro-symbolic AI and context-aware models for more reliable and trustworthy AI applications.
2024 BridgeGen Bridging Generative Networks with the Common Model of Cognition. Paper N/A
Summary Presents a framework for adapting the CMC to large generative network models in AI. This article presents a theoretical framework for adapting the Common Model of Cognition (CMC) to large generative network models within AI. It suggests restructuring CMC modules into shadow production systems that are peripheral to a central production system, which handles higher-level reasoning based on the output of shadow productions. This integration of symbolic reasoning and connectionist statistical learning aims to enhance CMC models with advanced cognitive abilities. The proposed reformulation, particularly of the ACT-R module conception, involves incorporating generative pre-trained networks. Central elements like procedural memory and working memory are maintained, while chunks and buffers facilitate communication and data processing within the architecture.
2020 Predictive Processing Predictive Processing: The Grand Unifying Theory of the Brain arXiv N/A
Summary The "Predictive Processing" paper explores the brain's function as a prediction machine, particularly in perception, language, and learning. It draws an analogy between cognitive processing and a baseball game, showing how split-second decisions are made based on predictive models. The paper applies this concept to language, suggesting that our brains predict linguistic elements, like syntax and phonology, in real-time. It describes a continuous cognitive loop during conversations, where the brain forms predictions, prepares responses, compares these to sensory input, and adjusts output accordingly. The paper discusses how encountering unexpected or unfamiliar language inputs leads to prediction errors, identified as N400 event-related potentials. This aspect is particularly relevant in language learning, where developing predictive models is essential for efficient communication. The challenge for language learners lies in adjusting to unfamiliar words or usage, similar to handling unexpected scenarios. Overall, the paper highlights the significance of predictive processing in managing cognitive load and its potential application in language teaching methodologies.
2011 Brain Statistical Inference The Brain as a Statistical Inference Engine - and You Can Too arXiv N/A
Summary The paper discusses the concept of the brain as a statistical inference engine, focusing on its capability to process and interpret information based on statistical patterns. Key insights include the use of statistical regularities by infants in lexical acquisition from unsegmented speech streams, and the brain's reliance on statistical data for visual perception. The paper emphasizes the critical role of Bayes’ Law in the brain's processing mechanism, allowing for a balance between prior knowledge and sensory input. It contrasts generative models, which are akin to the brain's method of understanding, with discriminative models, noting the advantages and limitations of each. The discussion extends to the importance of joint generative modeling in language and vision, the role of informative priors in language acquisition tasks like word segmentation, and the use of statistical models in image segmentation. The paper also explores various inference mechanisms and concludes by highlighting the significant contributions of statistical computational linguistics to cognitive science.
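The Bayesian updating at the heart of this view can be shown with a worked toy example; the hypotheses and probabilities below are invented purely to illustrate how a prior and a likelihood combine into a posterior.

```python
# Worked toy example of the Bayesian updating described above: combine a prior
# over two hypotheses with the likelihood of a noisy observation to get a
# posterior. The numbers are invented for illustration.

prior = {"word_boundary": 0.3, "no_boundary": 0.7}
likelihood_of_pause = {"word_boundary": 0.8, "no_boundary": 0.2}   # P(pause | h)

evidence = sum(prior[h] * likelihood_of_pause[h] for h in prior)   # P(pause)
posterior = {h: prior[h] * likelihood_of_pause[h] / evidence for h in prior}

print(posterior)   # {'word_boundary': ~0.63, 'no_boundary': ~0.37}
```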

Memory AI major projects

Publication Year Name Description Paper Link GitHub Link Summary
2022 Learn2Expire Learning to Expire: Synthesizing Lifelike Influence Expiry Times for Transformer Recommendations arXiv GitHub
Summary The paper introduces Expire-Span, a novel approach for efficient long-term memory management in sequence modelling. By assigning an expiration value to each memory state, the model selectively retains important information while expiring irrelevant data. This method is integrated into Transformer architectures, significantly enhancing their capacity to handle longer memory spans. Experimental results demonstrate its effectiveness in various tasks, including natural language processing and reinforcement learning. Expire-Span achieves state-of-the-art results, efficiently processing memory sizes up to tens of thousands. Its scalability and efficiency offer significant potential for complex applications requiring extensive and dynamic memory management.
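A much-simplified sketch of the expiry mechanism follows; in the paper each memory's span is predicted by the model, whereas here it is simply given, so treat this only as an illustration of the pruning rule.

```python
# Much-simplified sketch of the expiry idea: each memory gets a span (in the
# paper this is predicted by the model; here it is just given), and memories
# older than their span are dropped before attention is computed.

memories = [
    {"content": "greeting", "written_at": 0, "span": 3},
    {"content": "user goal", "written_at": 1, "span": 50},
    {"content": "filler chit-chat", "written_at": 2, "span": 2},
]

def retained(memories, current_step: int):
    return [m for m in memories if current_step - m["written_at"] <= m["span"]]

print([m["content"] for m in retained(memories, current_step=10)])
# ['user goal']  -- short-span memories have expired by step 10
```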
2024 LLM-Memory Memory Matters: The Need to Improve Long-Term Memory in LLM-Agents. Paper GitHub
Summary Reviews current efforts in developing LLM agents with a focus on improving long-term memory using vector databases. Highlights challenges and proposes future research topics. The paper reviews the development of autonomous agents using large language models (LLMs) and focuses on enhancing their long-term memory. It discusses how vector databases are pivotal for storing and retrieving long-term memory in LLM agents. The paper identifies challenges in memory management, such as differentiating types of memories and managing lifelong memories. Future research areas include improving metadata use in memory and integrating external knowledge sources. The document also highlights existing LLM agents like Auto-GPT and Voyager, detailing their functionalities and limitations. Finally, it proposes aligning LLM agents with cognitive architectures to effectively manage procedural, semantic, and episodic memories.
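A minimal in-memory stand-in for such a vector-store memory is sketched below; the bag-of-words "embedding" and cosine retrieval are placeholders for the learned embeddings and the actual vector database an LLM agent would use.

```python
# Minimal stand-in for the vector-database memory discussed above: store
# (vector, text) pairs and retrieve the most similar entry by cosine similarity.
# The "embedding" is a toy bag-of-words, purely to keep the sketch self-contained.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().replace("?", "").replace(".", "").split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

store: list[tuple[Counter, str]] = []

def remember(text: str) -> None:
    store.append((embed(text), text))

def recall(query: str) -> str:
    q = embed(query)
    return max(store, key=lambda item: cosine(item[0], q))[1]

remember("The user prefers vegetarian recipes.")
remember("The meeting was moved to Friday.")
print(recall("What food does the user like?"))   # The user prefers vegetarian recipes.
```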

Meta-level control major projects

Publication Year Name Description Paper Link GitHub Link Summary
2023 CORA Elemental AI Cognition product N/A Website
Summary Cora enables researchers to rapidly explore open-ended, complex questions and discover insights and answers grounded in factual, trustworthy, and contextualized evidence. Cora’s AI guides researchers to discover plausible answers that connect evidence across, and not just within, sources.
2023 MIT COGENT Elemental AI Cognition product Conference Website
Summary The MIT Encyclopedia of the Cognitive Sciences (MITECS), edited by Robert A. Wilson and Frank C. Keil and published by The MIT Press, is a comprehensive reference work encapsulating the diversity of methodologies and theories in the cognitive sciences. Since the 1970s, cognitive sciences have evolved multidisciplinarily, shaping our understanding of the mind and cognition. MITECS offers 471 concise entries on key concepts from Acquisition to Wundt and X-bar Theory, each written by a leading researcher. It includes six extended essays overviewing major areas of cognitive science: Philosophy, Psychology, Neurosciences, Computational Intelligence, Linguistics and Language, and Culture, Cognition, and Evolution. This encyclopedia, valuable for students and researchers, serves as a guide to the current state of cognitive sciences, with each article providing accessible introductions and references for further reading.
2022 BRAID Weaving Symbolic and Neural Knowledge into Coherent Logical Explanations Paper GitHub
Summary Braid is a novel logical reasoner that blends symbolic reasoning with statistical methods, addressing limitations of traditional symbolic engines like brittle inference and knowledge gaps. It introduces custom unifiers and dynamic rule generation to allow flexible term matching and hypothesize rules from a knowledge base. Braid's distributed, task-based framework efficiently constructs proof graphs, making it scalable. The reasoner demonstrates its efficacy in commonsense reasoning tasks like the ROC Story Cloze Test, achieving near state-of-the-art results with logical explanations. It incorporates frame-based approaches and integrates neural models for rule generation, showing adaptability and effectiveness in improving reasoning with user feedback.
2022 FactPEGASUS Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization Paper GitHub
Summary FactPEGASUS targets factual consistency in abstractive summarization by building factuality into both pre-training and fine-tuning. During pre-training, it modifies PEGASUS's gap-sentence selection so that pseudo-summaries are chosen for factuality as well as salience. During fine-tuning, it adds corrector, contrastor, and connector components that remove hallucinated content from reference summaries, contrast factual with non-factual summaries, and align fine-tuning with the pre-training setup. Evaluations on standard summarization benchmarks indicate improved factuality over strong baselines while retaining competitive summary quality.
2022 FactGraph FactGraph: Evaluating Factuality in Summarization with Semantic Graph Representations Paper GitHub
Summary Factuality evaluation in summarization is crucial but challenging due to existing models' lack of consistency with source documents. FACTGRAPH addresses this by decomposing documents and summaries into structured meaning representations (MR) using semantic graphs. It employs a graph encoder augmented with structure-aware adapters to capture semantic relations comprehensively. By combining text and graph representations, FACTGRAPH achieves superior performance, outperforming previous methods by up to 15% in factuality assessment. It excels in detecting content verifiability errors and identifying subsentence-level factual inconsistencies, offering a promising avenue for enhancing summarization quality in real-world applications.
2022 PrOntoQa A Systematic Formal Analysis of Chain-of-Thought Paper GitHub
Summary Large language models can produce chain-of-thought (CoT) explanations, but existing benchmarks don't directly assess their reasoning process. To address this, PRONTOQA, a new dataset, is introduced, enabling systematic exploration of language models' reasoning abilities. Analysis on models like INSTRUCTGPT and GPT-3 reveals proficiency in individual deduction steps but struggles with proof planning when faced with multiple valid options. PRONTOQA facilitates easy analysis by converting CoT into symbolic proofs, allowing direct evaluation of reasoning. Results suggest pretraining significantly influences LLM reasoning, especially in fictional contexts. PRONTOQA aids in understanding LLMs' capabilities and limitations, crucial for future research and model development.
2021 SOFAI Thinking Fast and Slow in AI: the Role of Metacognition Paper Website
Summary Advancements in AI remain predominantly in narrow domains, lacking the adaptability and generalizability of human intelligence. Integrating insights from Kahneman's "Thinking, Fast and Slow," the SOFAI architecture blends fast (system 1) and slow (system 2) agents, mimicking human cognitive processes. A meta-cognitive agent arbitrates between these systems, assessing resource constraints, solver abilities, and past experiences to optimize decision-making. Employing a two-phase assessment process inspired by human introspection, SOFAI balances speed and accuracy. By default, it employs system 1 solvers, akin to human intuition, minimizing time-to-action. SOFAI instances are being tested in various real-life sequential decision scenarios to enhance AI performance and flexibility.
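The arbitration idea can be summarized in a few lines of Python; the solvers, confidence values, and budget threshold below are invented placeholders rather than SOFAI's actual components.

```python
# Rough sketch of the fast/slow arbitration idea: try the cheap "system 1"
# solver by default, and escalate to the expensive "system 2" solver only when
# the metacognitive check finds low confidence and enough budget remains.
# The solvers and thresholds are invented placeholders.

def system1(problem: str) -> tuple[str, float]:
    return ("quick guess for " + problem, 0.55)        # (answer, confidence)

def system2(problem: str) -> tuple[str, float]:
    return ("careful solution for " + problem, 0.95)   # slower but more reliable

def metacognitive_solve(problem: str, confidence_threshold=0.8, budget=1.0, s2_cost=0.6):
    answer, confidence = system1(problem)
    if confidence < confidence_threshold and budget >= s2_cost:
        answer, confidence = system2(problem)
    return answer, confidence

print(metacognitive_solve("route planning"))   # escalates to the slow solver
```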
2021 SKATE A Natural Language Interface for Encoding Structured Knowledge Paper Website
Summary SKATE, a natural language interface, minimizes the disparity between user input and system comprehension by iteratively refining natural language through semi-structured templates. Leveraging a neural semantic parser, it suggests templates that are filled recursively to yield fully structured interpretations. Integrated with a neural rule-generation model, it facilitates the interactive acquisition of commonsense knowledge. Initial assessments demonstrate SKATE's effectiveness in comprehending stories. The architecture entails concept recognition, interpreted template production, and user refinement. Utilizing semantic frames processed by downstream applications, SKATE comprises components such as target representation vocabulary, concept recognizer, and semantic converter. Its applications span story understanding tasks to specialized domains like COVID-19 policy design, showcasing versatility and adaptability.
2020 GLUCOSE GeneraLized and COntextualized Story Explanations Paper Website
Summary GLUCOSE introduces a dataset capturing implicit commonsense causal knowledge within narrative contexts. It encompasses ten dimensions of causal explanation, addressing events, states, motivations, emotions, and naive psychology. Specific causal statements paired with general inference rules facilitate AI understanding. Utilizing a crowdsourcing platform, over 670K annotations were gathered from lay workers, focusing on everyday children's stories. The dataset tackles the challenge of acquiring and integrating commonsense knowledge into AI systems. GLUCOSE explanations adopt semi-structured inference rules, balancing between free text and logical forms, promoting better generalization. By scaffolding cognitive development in AI systems, GLUCOSE aims to enhance causal reasoning and generalization capabilities. The dataset and models are released for the AI research community to advance commonsense reasoning in various applications.

Benchmarks

Publication Year Name Description Paper Link GitHub Link Summary
2023 ComFact A Benchmark for Linking Contextual Commonsense Knowledge arXiv GitHub
Summary The paper introduces ComFact, a benchmark for commonsense fact linking, addressing the challenge of retrieving relevant knowledge from KGs for NLP systems. It highlights significant performance gains (34.6% F1) over heuristic methods, emphasizing the importance of accurate fact retrieval. Downstream tasks like dialogue response generation benefit from improved knowledge retrieval (9.8% average improvement). However, models still fall short of human performance, indicating research opportunities in commonsense augmentation. Challenges such as contextual relevance and implicitness are identified, prompting the need for more sophisticated retrieval methods. The proposed task and benchmark aim to foster advancements in commonsense fact linking, setting the stage for future research in NLP.
2023 Crow Benchmarking Commonsense Reasoning in Real-World Tasks arXiv GitHub
Summary The CROW benchmark innovatively evaluates commonsense reasoning in real-world tasks for NLP systems. Contrasting with artificial scenarios in existing datasets, CROW embeds commonsense-violating perturbations into six real-world tasks, including machine translation, open-domain dialogue, and safety detection. It employs a multi-stage data collection pipeline to rewrite examples from existing datasets, challenging models with nuanced reasoning tasks. The benchmark uses Macro-F1 and Situational Accuracy for evaluation, uncovering a significant gap between human performance and current NLP models. While CROW marks a significant advancement in commonsense reasoning benchmarks, it also acknowledges limitations like a narrow focus on commonsense dimensions and potential crowdsourcing biases.
2023 CaptionCon Do Androids Laugh at Electric Sheep? - Humor "Understanding" Benchmarks from The New Yorker Caption Contest. A corpus for caption generation models arXiv GitHub
Summary This research investigates if AI models can grasp humor by using tasks based on the New Yorker Cartoon Caption Contest, which require understanding the sophisticated interplay between cartoon images, captions, and cultural allusions. The tasks involve matching jokes to cartoons, identifying winning captions, and explaining the humor. Despite employing advanced models like GPT-4, AI performance lagged behind humans, exposing a significant gap in AI's humor comprehension. The study utilizes a comprehensive dataset from 14 years of contests and includes rich human-authored annotations. The findings and resources are made public, contributing to future AI research in humor understanding. However, the research's focus on New Yorker humor might limit its applicability to broader humor types.
2022 Causal Relation Benchmark Knowledge Graph Embeddings for Causal Relation Prediction Paper Zenodo
Summary The research investigates the effectiveness of Knowledge Graph (KG) embeddings in predicting causal relations between news events, a task hindered by the sparsity of existing causal KGs like Wikidata. By evaluating five different KG embedding and GCN-based link prediction methods, the study seeks to understand their capabilities in causal relation prediction. Two new causal KG datasets, WikiCV and WikiMV, were created from Wikidata for this purpose. The evaluation included classical accuracy measures and a novel manual approach, revealing that no single method consistently outperformed others. The study concludes that while current methods show limitations, they offer valuable contributions, indicating potential areas for future advancements in causal relation prediction.
2022 Wikidata Causal Event Triple Data Event Prediction using Case-Based Reasoning over Knowledge Graphs arXiv Zenodo
Summary
2022 Switchboard benchmark Toward Zero Oracle Word Error Rate on the Switchboard Benchmark arXiv Benchmarks.AI
Summary The study presents a detailed evaluation of the Switchboard benchmark in automatic speech recognition, highlighting key improvements. By employing a professional linguist to correct reference transcripts, the word error rate (WER) was substantially improved. The paper suggests alternatives to standard WER scoring, including transcript precision and recall, better reflecting human transcription tendencies. It examines various ASR alternative representations (utterance-level, word-level, and phrase-level) and showcases the effectiveness of phrase alternatives in achieving near-perfect oracle accuracy. The study also benchmarks commercial ASR systems against human services and underscores the potential of advanced ASR technologies in applications like audio search.
2022 BIG-BENCH dataset Quantifying Language Models' Capabilities Paper GitHub
Summary The BIG-bench benchmark evaluates language models' capabilities across a broad spectrum of tasks, covering linguistics, reasoning, and more.
2021 NLI QA dataset Can NLI Models Verify QA Systems’ Predictions? Paper GitHub
Summary The paper presents a method for using NLI to verify the accuracy of answers provided by QA systems, aiming to improve their reliability.
2020 PIQA Reasoning about Physical Commonsense in Natural Language arXiv GitHub Website
Summary The paper introduces the Physical Interaction: Question Answering (PIQA) benchmark, designed to evaluate natural language understanding systems' ability to reason about physical commonsense. This initiative addresses a significant challenge for AI systems: reliably answering questions about physical interactions in the world without experiencing it. This challenge is likened to children's development, where they form concepts based on the physical properties of objects before learning language. The primary source of data for PIQA is instructables.com, a website with user-generated instructions for a variety of tasks, emphasizing the use of everyday objects in often unconventional ways. This choice ensures that the data reflects a wide range of physical interactions and practical knowledge. In PIQA, questions are formatted as goal and solution pairs, demanding an understanding of the practical, physical steps involved in completing tasks. A key aspect of PIQA is its adversarial filtering method, AFLite, which helps minimize biases in the dataset. This ensures that the AI models genuinely learn about physical properties and interactions, rather than exploiting dataset-specific patterns. Despite the advancements in AI, especially in large-scale pretrained models like GPT, BERT, and RoBERTa, there remains a significant performance gap when compared to human accuracy in PIQA. This gap highlights the difficulty AI systems have in understanding even basic physical concepts and properties. For instance, RoBERTa struggles with differentiating solutions based on simple relations such as 'before' and 'after', or comprehending the versatile properties of 'water'. The study's findings emphasize the limitations of learning about the physical world from language alone. It advocates for a more interactive approach to AI development, where systems learn from real-world experiences, mirroring the way humans acquire knowledge. This approach could lead to more robust AI systems capable of understanding and interacting with the physical world in a more nuanced and practical manner.
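To make the goal/solution-pair format concrete, here is a schematic example and a trivial evaluation loop; the field names reflect the public PIQA release as best understood (a goal, two candidate solutions, and a label), and the scoring "model" is just a stub.

```python
# Schematic illustration of PIQA-style goal/solution pairs and a trivial
# evaluation loop. The field names are approximate and the scorer is a
# placeholder, not a real model.

examples = [
    {"goal": "Keep an open bag of chips fresh.",
     "sol1": "Seal the bag with a binder clip.",
     "sol2": "Seal the bag with a wet towel.",
     "label": 0},
]

def stub_model(goal: str, solution: str) -> float:
    """Placeholder plausibility scorer; a real system would use a trained model."""
    return len(solution)        # silly length heuristic, for demonstration only

correct = 0
for ex in examples:
    scores = [stub_model(ex["goal"], ex["sol1"]), stub_model(ex["goal"], ex["sol2"])]
    correct += int(scores.index(max(scores)) == ex["label"])

print(f"accuracy: {correct / len(examples):.2f}")
```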
2019 WinoGrande WinoGrande: An Adversarial Winograd Schema Challenge at Scale arXiv Website
Summary The paper "WINOGRANDE: An Adversarial Winograd Schema Challenge at Scale" introduces WINOGRANDE, a large-scale dataset of 44k problems designed to test AI commonsense reasoning. The dataset builds on the original Winograd Schema Challenge (WSC), but aims to reduce biases and improve both scale and challenge. It involves a novel crowdsourcing approach for data collection and a systematic bias reduction algorithm, AFLITE, to address machine-detectable embedding associations. This helps create a more robust dataset challenging for AI models but trivial for humans. The paper also explores the effectiveness of WINOGRANDE in transfer learning, highlighting its potential overestimation of AI commonsense capabilities. The authors emphasize the importance of algorithmic bias reduction in benchmarks to avoid overestimating machine intelligence.
2019 LitePyramid Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation Paper GitHub
Summary The study introduces a crowdsourced, lightweight version of the Pyramid method for manual summary evaluation, addressing the traditional Pyramid's high cost and complexity. This new method employs a sampling-based approach to extract Summary Content Units (SCUs) from reference summaries and uses crowdsourcing for evaluating system summaries. Tested using the DUC 2005 and 2006 datasets, it demonstrates a higher correlation with original Pyramid scores than with the Responsiveness method, suggesting greater reliability. The study also explores the balance between resource allocation and evaluation quality, indicating potential for cost efficiency and adaptability. It paves the way for reliable, scalable summary evaluations in future research.
2019 GLUE General Language Understanding Evaluation (GLUE) benchmark arXiv GLUE Benchmark
Summary The General Language Understanding Evaluation (GLUE) benchmark is a collection of tools for evaluating models on a variety of natural language understanding tasks. It features tasks like sentiment analysis and question answering, aiming to advance models that generalize across diverse linguistic tasks. GLUE emphasizes the importance of models sharing knowledge across tasks, particularly when training data is limited. Evaluation of baseline models, including ELMo, shows that multi-task training is more effective than single-task training. However, there is a significant need for improvement in overall model performance, especially in areas like logical reasoning, indicating fertile ground for future NLU research.
2019 CLUTRR (Meta) A Diagnostic Benchmark for Inductive Reasoning from Text arXiv GitHub
Summary CLUTRR is a benchmark suite designed to test NLU systems' robustness and generalization capabilities using inductive reasoning on kinship relations in short stories. It combines semi-synthetic stories based on kinship graphs and crowd-sourced narratives, assessing systematic generalization on unseen logical combinations and robustness against noise. Comparative studies reveal performance gaps between traditional NLU models like BERT and graph-based models such as GAT, especially in reasoning with unstructured text. CLUTRR's diagnostic capabilities underscore the challenges in machine reasoning and point towards the need for more advanced, robust NLU systems capable of handling diverse linguistic tasks and generalizing systematically.
2019 Natural Questions Benchmark for QA Research Paper GitHub
Summary The paper presents the Natural Questions corpus, a large dataset aimed at improving open-domain question answering, featuring real user queries and annotations.
2018 FEVER Large-scale Dataset for Fact Extraction and Verification Paper GitHub
Summary The FEVER dataset aids in developing models for verifying textual claims against sources, categorized as SUPPORTED, REFUTED, or NOTENOUGHINFO.
2016 SQuAD 100,000+ Questions for Machine Comprehension of Text arXiv GitHub
Summary The Stanford Question Answering Dataset (SQuAD) is a large-scale dataset for advancing machine comprehension of text, comprising over 100,000 questions formulated by crowdworkers based on Wikipedia articles. Each question is designed to have answers that are segments from the text, presenting a significant challenge in natural language understanding and world knowledge. The dataset exhibits a diverse range of question types and answer formats, highlighting the complexity of the reading comprehension task. While human performance notably surpasses machine models on SQuAD, it underscores the potential for future research and development in this area, making it a valuable resource for the natural language processing and machine learning communities.
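The nested SQuAD v1.1 JSON layout (articles, paragraphs, question-answer pairs, and character-level answer offsets) can be illustrated with a tiny hand-made example; the passage below is invented rather than taken from the dataset.

```python
# Tiny, self-contained example in the SQuAD v1.1 JSON layout (data -> paragraphs
# -> qas -> answers with character-level answer_start), showing how an answer
# span is recovered from the context. The example passage is invented.

squad_like = {
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "SQuAD was released in 2016.",
            "qas": [{
                "id": "q1",
                "question": "When was SQuAD released?",
                "answers": [{"text": "2016", "answer_start": 22}],
            }],
        }],
    }]
}

for article in squad_like["data"]:
    for para in article["paragraphs"]:
        ctx = para["context"]
        for qa in para["qas"]:
            ans = qa["answers"][0]
            span = ctx[ans["answer_start"]: ans["answer_start"] + len(ans["text"])]
            assert span == ans["text"]
            print(qa["question"], "->", span)     # When was SQuAD released? -> 2016
```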
2014 SICK A SICK cure for the evaluation of compositional distributional semantic models Paper N/A
Summary The document introduces the SICK dataset, specifically designed for evaluating Compositional Distributional Semantic Models (CDSMs). It addresses the need for a comprehensive benchmark focusing on sentence-level semantics in computational systems. SICK contains approximately 10,000 English sentence pairs, deliberately excluding complex elements like idiomatic expressions and named entities, to concentrate on lexical, syntactic, and semantic phenomena. Each sentence pair is annotated for semantic relatedness and entailment, employing crowdsourcing to ensure a broad range of inputs. The dataset originates from the ImageFlickr and SemEval 2012 STS MSR-Video Description datasets, undergoing a process of normalization and expansion to create varied linguistic pairs. This makes SICK a valuable resource for assessing the ability of CDSMs to handle complex sentence-level semantics. It was notably used in SemEval 2014 Task 1, which aimed at the semantic evaluation of computational models.
2012 Atari The Arcade Learning Environment: An Evaluation Platform for General Agents arXiv GitHub
Summary competency of AI agents, offering an interface to a wide array of Atari 2600 games. Each game presents unique challenges, providing a comprehensive testbed for techniques in reinforcement learning, planning, and other AI domains. The ALE facilitates benchmarking against well-established AI methods, demonstrated through empirical evaluations across over 55 diverse games. All software and benchmark agents are made publicly available, supporting widespread research use. This initiative represents a significant step in AI research, pushing towards achieving general competency across a wide range of Atari 2600 games.
2012 MNIST Database of Handwritten Digit Images for Machine Learning Research Paper MNIST
Summary The MNIST database is a fundamental resource in machine learning and pattern recognition, consisting of 60,000 training and 10,000 test images of handwritten digits. Originating from the NIST database, it offers standardized, size-normalized, and centered images for algorithm benchmarking. Its widespread use allows for effective comparison of new machine learning algorithms against established ones, particularly highlighting the superiority of convolutional neural networks, especially when using data augmentation techniques like elastic distortion. MNIST serves as an accessible starting point for researchers and students, facilitating exploration of machine learning techniques with minimal preprocessing effort, analogous to the role of the TIMIT database in speech processing research.
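For reference, the raw MNIST distribution uses a simple big-endian IDX format that can be read in a few lines; the file name in the commented call is a placeholder for wherever the unzipped files live.

```python
# Sketch of reading the raw MNIST IDX image file (big-endian header of magic
# number, image count, rows, cols, followed by unsigned-byte pixels).
import struct
import numpy as np

def load_idx_images(path: str) -> np.ndarray:
    with open(path, "rb") as f:
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        assert magic == 2051, "not an IDX image file"
        pixels = np.frombuffer(f.read(), dtype=np.uint8)
    return pixels.reshape(count, rows, cols)

# images = load_idx_images("train-images-idx3-ubyte")   # shape (60000, 28, 28)
```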
2011 Winograd The Winograd Schema Challenge Paper Website
Summary This paper introduces the Winograd Schema Challenge as an alternative to the Turing Test for evaluating a machine's understanding of human language. A Winograd schema consists of a pair of sentences that differ in only a few words and contain an ambiguous pronoun whose reference is clear to a human reader but difficult for AI. The challenge is designed to avoid reliance on statistical methods or selectional restrictions by requiring understanding of the sentences' context and commonsense knowledge. The authors argue that this challenge is more precise in testing genuine language understanding and reasoning than the Turing Test. It aims to foster AI research that genuinely advances the field's understanding of language and reasoning, moving away from the deceiving complexities of the Turing Test. The Winograd Schema Challenge demands a nuanced understanding of language and world knowledge, aiming to test AI's understanding in a straightforward, clear-cut manner.
2009 ImageNet Image database organized according to the WordNet hierarchy (currently only the nouns) Paper ImageNet
Summary ImageNet is a large-scale hierarchical image database built upon the WordNet structure, aiming to populate thousands of synsets with millions of high-quality, human-annotated images. The current state features 12 subtrees with 5247 synsets, totaling 3.2 million images. It surpasses other image datasets in scale, accuracy, and diversity. ImageNet is constructed using Amazon Mechanical Turk, ensuring precise and quality-controlled images. Its hierarchical structure and semantic organization offer unique advantages for various computer vision applications, including object recognition and image classification. Future goals include expanding the database to around 50 million images and fostering a collaborative community platform.
1996 FraCas A Type-Theoretical system for the FraCaS test suite: Grammatical Framework meets Coq Paper N/A
Summary The FraCaS Consortium's "A Framework for Computational Semantics" outlines three approaches to building a framework for computational semantics: logical, conceptual, and toolbox.
1. Logical Framework: This involves creating a neutral, canonical representation language to compare different semantic theories. It discusses the practical and methodological difficulties in selecting a neutral canonical representation and the level at which comparisons should be made. Various semantic theories, like situation theory and dynamic logic, are encoded within a unified framework to measure their overlap and differences.
2. Conceptual Framework: It addresses the basic concepts within computational semantics, aiming to identify areas of agreement among different approaches. This includes creating a glossary of common semantic terms, surveying central semantic phenomena, and describing common semantic/logical concepts needed across different theories.
3. Toolbox Approach: It refers to a set of techniques, algorithms, or representations that can be incorporated into implementations. This approach underscores the challenge of developing tools that are independent of specific theoretical assumptions. The toolbox includes specific examples like the Hobbs-Shieber quantifier scoping algorithm and aspectual coercion calculus.
Overall, the FraCaS project aims to encourage computational semantics towards a more structured and unified field, facilitating the development of more sophisticated and functional systems.

Generative AI Impactful Projects

Publication Year Name Description Paper Link GitHub Link Summary
2023 Fairseq (Meta) Toolkit for sequence modeling and LLMs various (see GitHub page) GitHub
Summary Fairseq, by Facebook AI Research, is an extensible Python toolkit for sequence modeling, particularly for tasks like translation, summarization, language modeling, and other text generation tasks. It has a variety of implemented papers and offers multi-GPU training, fast generation on CPU/GPU, mixed precision training, and extensive configuration options. Recent updates include new models and code releases for diverse speech technology, unsupervised speech recognition, and more. It integrates with xFormers and provides pre-trained models through a torch.hub interface. For installation, it requires PyTorch (version >= 1.10.0), Python (version >= 3.8), and NVIDIA GPU for model training. Fairseq's versatility, community support, and continuous updates make it a powerful resource for researchers and developers in AI and machine learning fields.
2023 T5X Toolkit for training large language models at scale arXiv GitHub
Summary t5x and seqio are libraries developed to facilitate the scaling of large language models, addressing challenges in computation distribution and data management. t5x simplifies building and training Transformer models, leveraging JAX and Flax for efficient scaling; it supports data, parameter, and activation partitioning via XLA GSPMD and jax.pjit (a generic partitioning sketch appears after this table). seqio manages data pipelines and evaluation through a task-based API, with deterministic pipelines for more robust training. Both libraries are widely used for research and production, with ongoing development driven by user feedback, demonstrating their practicality and impact in large-scale machine learning.
2022 Tailor Generating Text with Semantic Controls Paper GitHub
Summary Tailor is a semantically-controlled text generation and perturbation system: it creates or modifies sentences via control codes derived from semantic roles, and is used for data augmentation and for probing and improving model robustness.
2022 NLI spurious dataset Mitigating Spurious Correlations in NLI Datasets Paper GitHub
Summary The paper generates a debiased version of NLI training data so that models can be trained without picking up dataset-specific biases, improving generalization across a range of test sets.
2020 Octopus Paper Climbing towards NLU: On Meaning, Form, and Understanding in the Age of Data Conference N/A
Summary This paper challenges the notion that large neural language models (LMs), like BERT, understand language meaning by merely training on linguistic form (text data). It argues that understanding necessitates relating linguistic forms to communicative intents external to language, a dimension not captured by current LMs. The paper draws parallels with human language acquisition, highlighting the role of interaction and real-world grounding, absent in machine learning approaches. Thought experiments illustrate the limitations of learning meaning from form. The authors call for more grounded, realistic approaches in natural language understanding (NLU) research, emphasizing accuracy in claims about language model capabilities and the importance of contextual and real-world learning.
2019 RoBERTa A Robustly Optimized BERT Pretraining Approach Paper GitHub
Summary"RoBERTa" by Yinhan Liu and others proposes an improved training method for BERT, achieving state-of-the-art results on several benchmark tasks.
2019 Right for the Wrong Reasons Diagnosing Syntactic Heuristics in NLI Paper GitHub
Summary The study shows that NLI models often rely on shallow syntactic heuristics, such as lexical overlap between premise and hypothesis, rather than genuine inference, and introduces the HANS evaluation set to diagnose this behavior.
2019 T5 (Google) Text-to-Text Transfer Transformer arXiv GitHub
Summary This paper introduces a unified framework for NLP transfer learning by converting every language task into a text-to-text format, implemented as the Text-to-Text Transfer Transformer (T5) and pre-trained on the Colossal Clean Crawled Corpus (C4), a large English text dataset. The study systematically explores different aspects of transfer learning, including model architectures, pre-training objectives, properties of the unlabeled data, and fine-tuning methods, with particular attention to the effect of scaling model and data size. Combining these insights with large-scale training yields state-of-the-art results on a range of language understanding benchmarks (a minimal text-to-text usage sketch appears after this table).
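As noted in the Fairseq entry above, pre-trained models are exposed through torch.hub. The snippet below is a hedged sketch based on fairseq's documented hub interface; it assumes fairseq and the moses/fastbpe extras are installed, and the model name is one of the published WMT'19 checkpoints.

```python
# Hedged sketch: loading a pre-trained fairseq translation model via torch.hub.
import torch

# WMT'19 English->German single model, with Moses tokenization and fastBPE codes.
en2de = torch.hub.load('pytorch/fairseq',
                       'transformer.wmt19.en-de.single_model',
                       tokenizer='moses', bpe='fastbpe')
en2de.eval()

print(en2de.translate('Neuro-symbolic AI combines learning and reasoning.'))
```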
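The T5X entry above mentions parameter partitioning via XLA GSPMD and jax.pjit. The sketch below is a generic JAX illustration of that mechanism, not t5x's own API; the array shapes and the "model" axis name are illustrative assumptions.

```python
# Generic sketch of GSPMD-style parameter sharding in JAX (not t5x's API).
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())                  # 1 CPU locally; 8+ accelerators in practice
mesh = Mesh(devices.reshape(-1), axis_names=("model",))

weights = jnp.ones((1024, 1024))                   # hypothetical parameter matrix
sharded_w = jax.device_put(weights, NamedSharding(mesh, P(None, "model")))  # shard columns

@jax.jit
def forward(x, w):
    return x @ w                                   # XLA inserts any needed collectives

out = forward(jnp.ones((8, 1024)), sharded_w)
print(out.shape)                                   # (8, 1024)
```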
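For the T5 entry, the sketch below shows the text-to-text interface through the Hugging Face transformers port of T5 rather than the paper's original TensorFlow codebase; the task prefix and checkpoint name follow the library's documented usage.

```python
# Sketch: T5's text-to-text interface via Hugging Face transformers.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Every task is cast as text in / text out; the prefix names the task.
batch = tokenizer("translate English to German: The house is wonderful.",
                  return_tensors="pt")
out = model.generate(**batch, max_new_tokens=40)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```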

Useful AI tools (2024) - There's literally an AI for everything

  • Website - Explore Code Llama, Meta's code-specialized large language model family.

  • Website - Learn about OpenAI's ChatGPT technology.

  • Website - Discover Claude's AI chat capabilities.

  • Website - Experience Bing Chat powered by GPT-4.

  • Website - Introduction to Gemini, Google's AI technology.

  • Website - "2023 Generative AI Landscape" on LinkedIn by Florian Belschner.

  • Website - "The Generative AI Tools Landscape" on DataCamp.

Image: overview of the generative AI tools landscape.

Links to other useful GitHub pages

  • Awesome-Efficient-LLM - A comprehensive collection of resources on Efficient Large Language Models.
  • AnalogyTools - Repository dedicated to tools for analogy-based learning and reasoning in AI.
  • Awesome-Knowledge-Graph - A curated list of awesome knowledge graph resources, papers, and tools.

Books

Publication Year Name Description Paper Link GitHub Link Summary
2023 Neuro-Symbolic AI Design transparent and trustworthy systems that understand the world as you do Book N/A
Summary "Neuro-Symbolic AI: Design Transparent and Trustworthy Systems That Understand the World as You Do" by Alexiei Dingli and David Farrugia explores the emerging field of neuro-symbolic AI (NSAI), which combines the strengths of symbolic AI and neural networks to create systems capable of human-like reasoning. The book begins with a historical overview of AI, covering the evolution and limitations of traditional AI, symbolic AI, and neural networks. It highlights the need for explainable AI (XAI), emphasizing transparency and interpretability as critical components for the future of AI systems. The authors delve into the principles and mechanics of NSAI, presenting it as the next level of AI that addresses the shortcomings of purely symbolic or neural approaches. NSAI leverages the pattern recognition capabilities of neural networks and the reasoning abilities of symbolic AI, creating hybrid models that offer both high performance and transparency. The book discusses various NSAI architectures, such as the Neuro-Symbolic Concept Learner (NSCL) and Neuro-Symbolic Dynamic Reasoning (NSDR), showcasing their applications in fields like health, education, and finance. Practical programming techniques for implementing NSAI are provided, with hands-on examples in Python to help readers apply the concepts in real-world scenarios. The authors also explore future AI developments, including quantum computing, neuromorphic engineering, and brain-computer interaction, while addressing the ethical implications and challenges associated with these advancements. Targeted at data scientists, ML engineers, and AI enthusiasts, this book aims to provide a comprehensive understanding of NSAI and its potential to revolutionize AI systems. By combining theory with practical applications and real-world examples, Dingli and Farrugia offer a valuable resource for those looking to stay at the forefront of AI innovation
2019 Rebooting AI Building Artificial Intelligence We Can Trust Book N/A
Summary "Rebooting AI" by Gary Marcus and Ernest Davis addresses the substantial gap between the current state of artificial intelligence (AI) and the aspirational goals of creating safe, smart, and reliable AI. This gap, termed the "AI Chasm," is attributed to several challenges. The first is the "gullibility gap," where humans mistakenly attribute more intelligence to AI systems than is justified. This is due to our tendency to anthropomorphize machines based on superficial behaviors that mimic human actions. Current AI systems, particularly those leveraging deep learning, have achieved remarkable successes in narrow applications but fail to incorporate broader, more abstract knowledge. This limitation is critical because AI systems trained on biased data sets can perpetuate and even amplify historical biases. This is evident in various fields, including medical diagnostics and autonomous driving, where such biases can have serious consequences. The book emphasizes the importance of common sense in AI, a capability that remains elusive yet essential for machines to understand and interact effectively with the world. The authors argue that current AI cannot construct and use cognitive models similar to those humans use, which is crucial for tasks requiring flexible thinking and understanding complex texts. To address these issues, the authors advocate for a paradigm shift in AI development. This involves building systems that can learn and represent the core frameworks of human knowledge, such as time, space, causality, and human interactions. Additionally, they emphasize the need for trustworthy AI systems that can be relied upon in high-stakes scenarios. Ultimately, the book envisions a future where AI, grounded in reasoning and common-sense values, can significantly transform society. Achieving this will require overcoming the current limitations and biases, and developing AI that can adapt to new and unforeseen situations as flexibly as humans do.

Usage

This repository primarily serves as a collection of links and references. You can explore the content by browsing the directories or using the search functionality on GitHub. Feel free to use the information here for your research, projects, or personal learning.

Contributing

If you have valuable resources, papers, or articles related to neuro-symbolic AI and cognition within AI, we encourage you to contribute to this repository. Please follow the guidelines in our Contribution Guidelines.

License

This repository is open source and available under the GNU License. Feel free to use and share the content while respecting the terms of the license.

We hope this repository helps you on your journey to understanding and exploring the exciting world of neuro-symbolic AI and cognition within AI. If you have any questions or suggestions, don't hesitate to reach out.

Thank you for visiting!


Contact Information

For inquiries or suggestions, please contact Brandon Colelough.
