
Awesome Artificial General Intelligence and Computational Cognitive Sciences

An awesome & curated list for Artificial General Intelligence, an emerging interdisciplinary field that combines artificial intelligence and computational cognitive sciences at its core, along with probability and statistics, formal logic, cognitive and developmental psychology, computational philosophy, cognitive neuroscience, and computational sociology. We promote high-level machine intelligence by drawing inspiration from the way humans learn and think, while simultaneously deepening our understanding of human cognition. We believe that this kind of reciprocal research is a promising path towards our big picture: building human-level intelligent systems with capabilities such as abstracting, explaining, learning, planning, and making decisions. Such intelligence may broadly help people improve scientific research, engineering, and the arts, which are the hallmarks of human intelligence.

Awesome AGI & CoCoSci is an all-in-one collection, consisting of resources ranging from basic courses and tutorials to papers and books on diverse topics from multiple perspectives. Both junior and senior researchers, whether learning, working on, or working around AGI and CoCoSci, will find material of interest here.

Contributing

Contributions are greatly welcomed! Please refer to Contribution Guidelines before taking any action.

  * [Quantitative Analysis](#quantitative-analysis) 
<!--* [Tasks & Environments](#te)-->

Papers

Abduction

Explanation

Scientific Discovery

Rationalization

  • Imagination and the generation of new ideas - Cognitive Development, 2015. [All Versions]. A piece of evidence for rationalization in childhood.

  • Coalescing the Vapors of Human Experience into a Viable and Meaningful Comprehension - CogSci'16, 2016. [All Versions]. Constrained thinking as rationalization.

  • How We Know What Not To Think - Trends in Cognitive Sciences, 2019. [All Versions]. A comprehensive review on rationalization.

  • Rationalization is rational - Behavioral and Brain Sciences, 2020. [All Versions]. [Preprint]. Rationalization occurs when a person has performed an action and then concocts the beliefs and desires that would have made it rational. Then, people often adjust their own beliefs and desires to match the concocted ones. While many studies demonstrate rationalization, and a few theories describe its underlying cognitive mechanisms, we have little understanding of its function. Why is the mind designed to construct post hoc rationalizations of its behavior, and then to adopt them? This may accomplish an important task: transferring information between the different kinds of processes and representations that influence our behavior. Human decision making does not rely on a single process; it is influenced by reason, habit, instinct, norms, and so on. Several of these influences are not organized according to rational choice (i.e., computing and maximizing expected value). Rationalization extracts implicit information – true beliefs and useful desires – from the influence of these non-rational systems on behavior.

  • Rationalizing constraints on the capacity for cognitive control - Trends in Cognitive Sciences, 2021. [All Versions]. Humans are remarkably limited in: (i) how many control-dependent tasks they can execute simultaneously, and (ii) how intensely they can focus on a single task. These limitations are universal assumptions of most theories of cognition. Yet, a rationale for why humans are subject to these constraints remains elusive. This feature review draws on recent insights from psychology, neuroscience, and machine learning, to suggest that constraints on cognitive control may result from a rational adaptation to fundamental, computational dilemmas in neural architectures. The reviewed literature implies that limitations in multitasking may result from a trade-off between learning efficacy and processing efficiency and that limitations in the intensity of commitment to a single task may reflect a trade-off between cognitive stability and flexibility.

  • Why Imaginary Worlds? The psychological foundations and cultural evolution of fictions with imaginary worlds - Behavioral and Brain Sciences, 2021. [All Versions]. A review of rationalization as imaginary worlds in fictions. The perspective proposes that imaginary worlds co-opt our preferences for exploration, which have evolved in humans and nonhuman animals alike, to propel individuals toward new environments and new sources of reward.

Applications in AI

  • Functional genomic hypothesis generation and experimentation by a robot scientist - Nature, 2004. [All Versions]. This paper describes a physically implemented robotic system that applies techniques from artificial intelligence to carry out cycles of scientific experimentation. The system automatically originates hypotheses to explain observations, devises experiments to test these hypotheses, physically runs the experiments using a laboratory robot, interprets the results to falsify hypotheses inconsistent with the data, and then repeats the cycle. The system is applied to the determination of gene function using deletion mutants of yeast (Saccharomyces cerevisiae) and auxotrophic growth experiments. The authors built and tested a detailed logical model (involving genes, proteins and metabolites) of the aromatic amino acid synthesis pathway.

  • Interpretation as abduction - Artificial Intelligence, 1993. [All Versions]. Abduction is inference to the best explanation. The authors have developed an approach to abductive inference, called “weighted abduction”, that has resulted in a significant simplification of how the problem of interpreting texts is conceptualized. The interpretation of a text is the minimal explanation of why the text would be true. More precisely, to interpret a text, one must prove the logical form of the text from what is already mutually known, allowing for coercions, merging redundancies where possible, and making assumptions where necessary. It is shown how such “local pragmatics” problems as reference resolution, the interpretation of compound nominals, the resolution of syntactic ambiguity and metonymy, and schema recognition can be solved in this manner. Moreover, this approach of “interpretation as abduction” can be combined with the older view of “parsing as deduction” to produce an elegant and thorough integration of syntax, semantics, and pragmatics, one that spans the range of linguistic phenomena from phonology to discourse structure.

  • Probabilistic Horn abduction and Bayesian networks - Artificial Intelligence, 1993. [All Versions]. This paper presents a simple framework for Horn-clause abduction, with probabilities associated with hypotheses. The framework incorporates assumptions about the rule base and independence assumptions amongst hypotheses. It is shown how any probabilistic knowledge representable in a discrete Bayesian belief network can be represented in this framework. The main contribution is in finding a relationship between logical and probabilistic notions of evidential reasoning. This provides a useful representation language in its own right, providing a compromise between heuristic and epistemic adequacy.

  • Abductive Inference in Bayesian Networks: A Review - Advances in Bayesian Networks, 2004. [All Versions]. The goal of this paper is to serve as a survey for the problem of abductive inference (or belief revision) in Bayesian networks. The problem is introduced in its two variants: total abduction (or MPE) and partial abduction (or MAP). It is also formulated in its general case, that is, looking for the K best explanations. Then, a (non-exhaustive) review of exact and approximate algorithms for both abductive inference problems is carried out. Finally, the authors collect the main complexity results that have appeared in the literature for both problems (MPE and MAP).
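The total-abduction (MPE) problem these surveys study can be illustrated by brute-force enumeration on a toy burglary-alarm network. This is a minimal sketch, not taken from any of the papers above: the CPT values are illustrative textbook-style numbers, and exhaustive enumeration stands in for the exact and approximate algorithms the review covers.

```python
from itertools import product

# Toy Bayes net: Burglary -> Alarm <- Earthquake, with illustrative CPTs.
P_B = {True: 0.01, False: 0.99}                       # P(Burglary)
P_E = {True: 0.02, False: 0.98}                       # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,       # P(Alarm=T | B, E)
       (False, True): 0.29, (False, False): 0.001}

def joint(b, e, a):
    """Joint probability of a full assignment (B=b, E=e, Alarm=a)."""
    p_a = P_A[(b, e)]
    return P_B[b] * P_E[e] * (p_a if a else 1.0 - p_a)

def mpe(alarm=True):
    """Total abduction / MPE: the most probable full assignment to the
    unobserved variables (B, E) given the evidence Alarm=alarm."""
    return max(product([True, False], repeat=2),
               key=lambda be: joint(be[0], be[1], alarm))

print(mpe())  # -> (True, False): a burglary alone best explains the alarm
```

Partial abduction (MAP) would instead maximize over a subset of the variables while summing out the rest, which is what makes it computationally harder than MPE.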

  • Abductive Logic Programming - Journal of Logic Computation, 1992. [All Versions]. This paper is a survey and critical overview of recent work on the extension of logic programming to perform abductive reasoning (abductive logic programming). The authors outline the general framework of abduction and its applications to knowledge assimilation and default reasoning; and they introduce an argumentation-theoretic approach to the use of abduction as an interpretation for negation as failure.

  • ACLP: Abductive Constraint Logic Programming - The Journal of Logic Programming, 1999. [All Versions]. This paper presents the framework of Abductive Constraint Logic Programming (ACLP), which integrates Abductive Logic Programming (ALP) and Constraint Logic Programming (CLP). In ACLP, the task of abduction is supported and enhanced by its non-trivial integration with constraint solving. This integration of constraint solving into abductive reasoning facilitates a general form of constructive abduction and enables the application of abduction to computationally demanding problems. The paper studies the formal declarative and operational semantics of the ACLP framework together with its application to various problems.

  • Abduction in Logic Programming - Computational Logic, 2002. [All Versions]. [Preprint]. Abduction in Logic Programming started in the late 80s, early 90s, in an attempt to extend logic programming into a framework suitable for a variety of problems in Artificial Intelligence and other areas of Computer Science. This paper aims to chart out the main developments of the field over the last ten years and to take a critical view of these developments from several perspectives: logical, epistemological, computational and suitability to application. The paper attempts to expose some of the challenges and prospects for the further development of the field.

  • Bayesian Abductive Logic Programs: A Probabilistic Logic for Abductive Reasoning - IJCAI'11, 2011. [All Versions]. [Preprint]. This work introduces Bayesian Abductive Logic Programs (BALP), a probabilistic logic that adapts Bayesian Logic Programs (BLPs) for abductive reasoning. Like BLPs, BALPs also combine first-order logic and Bayes nets. However, unlike BLPs, which use deduction to construct Bayes nets, BALPs employ logical abduction. As a result, BALPs are more suited for problems like plan/activity recognition that require abductive reasoning.
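As a concrete illustration of the abductive-logic-programming style surveyed above, here is a minimal propositional sketch (not from any of the cited systems): given definite rules and a set of abducible atoms, it enumerates the minimal assumption sets whose deductive closure entails an observation.

```python
from itertools import combinations

# Rules as (body, head): the head holds if every atom in the body holds.
RULES = [
    ({"rain"}, "wet_grass"),
    ({"sprinkler"}, "wet_grass"),
    ({"wet_grass"}, "wet_shoes"),
]
ABDUCIBLES = ["rain", "sprinkler"]  # atoms we are allowed to assume

def closure(facts):
    """Forward-chain the rules to a fixed point from an initial set of facts."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= facts and head not in facts:
                facts.add(head)
                changed = True
    return facts

def abduce(observation):
    """Return the minimal abducible subsets whose closure entails the observation."""
    explanations = []
    for k in range(len(ABDUCIBLES) + 1):          # smallest subsets first
        for hyp in combinations(ABDUCIBLES, k):
            if observation in closure(hyp):
                # keep only subset-minimal explanations
                if not any(set(e) <= set(hyp) for e in explanations):
                    explanations.append(hyp)
    return explanations

print(abduce("wet_shoes"))  # -> [('rain',), ('sprinkler',)]
```

Real ALP systems add integrity constraints, negation as failure, and (in ACLP) constraint solving on top of this basic generate-and-test pattern.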

  • Abductive Plan Recognition by Extending Bayesian Logic Programs - ECML'11, 2011. [All Versions].

  • An Approach to Abductive Reasoning in Equational Logic - IJCAI'13, 2013. [All Versions].

  • Abduction-Based Explanations for Machine Learning Models - AAAI'19, 2019. [All Versions].

  • Probabilistic Sufficient Explanations - IJCAI'21, 2021. [All Versions].

  • Machine Translation Using Abductive Inference - COLING, 1990. [All Versions]. An application of abduction in language translating.

  • Automated Biodesign Engineering by Abductive Meta-Interpretive Learning - AAAI Spring Symposium Series 2021 on Artificial Intelligence for Synthetic Biology, 2021. [All Versions]. This work proposes an automated biodesign engineering framework empowered by Abductive Meta-Interpretive Learning (MetaAbd), a novel machine learning approach that combines symbolic and sub-symbolic machine learning, to further enhance the design-build-test-learn cycle by enabling the learning machine to 1) exploit domain knowledge and learn human-interpretable models that are expressed by formal languages such as first-order logic; 2) simultaneously optimise the structure and parameters of the models to make accurate numerical predictions; 3) reduce the cost of experiments and effort on data annotation by actively generating hypotheses and examples.

  • Human Comprehensible Active Learning of Genome-Scale Metabolic Networks - AAAI Spring Symposium Series 2023 on Computational Scientific Discovery, 2023. [All Versions]. [Extended Abstract]. [Slides]. This work introduces a novel machine learning framework ILP-iML1515 based on Inductive Logic Programming (ILP) that performs abductive logical reasoning and actively learns from training examples. The ILP-iML1515 framework 1) allows high-throughput simulations and 2) actively selects experiments that reduce the experimental cost of learning gene functions in comparison to randomly selected experiments.

Bayesian Modeling

Bayesian Induction

  • Bayesian Epistemology - Plato Stanford. A computational philosophy account on the nature of uncertainty modeling in Bayesian Epistemology.

  • Probabilistic machine learning and artificial intelligence - Nature, 2015. [All Versions]. Probabilistic modelling provides a framework for understanding what learning is, and has therefore emerged as one of the principal theoretical and practical approaches for designing machines that learn from data acquired through experience. The probabilistic framework, which describes how to represent and manipulate uncertainty about models and predictions, has a central role in scientific data analysis, machine learning, robotics, cognitive science and artificial intelligence. This Review provides an introduction to this framework, and discusses some of the state-of-the-art advances in the field, namely, probabilistic programming, Bayesian optimization, data compression and automatic model discovery.

  • Generalization, similarity, and Bayesian inference - Behavioral and Brain Sciences, 2001. [All Versions]. [Preprint]. Shepard has argued that a universal law should govern generalization across different domains of perception and cognition, as well as across organisms from different species or even different planets. Starting with some basic assumptions about natural kinds, he derived an exponential decay function as the form of the universal generalization gradient, which accords strikingly well with a wide range of empirical data. However, his original formulation applied only to the ideal case of generalization from a single encountered stimulus to a single novel stimulus, and for stimuli that can be represented as points in a continuous metric psychological space. The authors recast Shepard's theory in a more general Bayesian framework and show how this naturally extends his approach to the more realistic situation of generalizing from multiple consequential stimuli with arbitrary representational structure. This framework also subsumes a version of Tversky's set-theoretic model of similarity, which is conventionally thought of as the primary alternative to Shepard's continuous metric space model of similarity and generalization.

  • Bayesian modeling of human concept learning - NeurIPS'98, 1998. [All Versions]. [Preprint]. This work considers the problem of learning concepts from small numbers of positive examples, a feat which humans perform routinely but which computers are rarely capable of. Bridging machine learning and cognitive science perspectives, this work presents both theoretical analysis and an empirical study with human subjects for the simple task of learning concepts corresponding to axis-aligned rectangles in a multidimensional feature space. Existing learning models, when applied to this task, cannot explain how subjects generalize from only a few examples of the concept. The author proposes a principled Bayesian model based on the assumption that the examples are a random sample from the concept to be learned. The model gives precise fits to human behavior on this simple task and provides qualitative insights into more complex, realistic cases of concept learning.
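The "size principle" at the heart of this model can be sketched in a one-dimensional analogue of the rectangle task: hypotheses are integer intervals, the likelihood of n consistent examples under hypothesis h is (1/|h|)^n, and generalization averages over the posterior. A hedged sketch with illustrative ranges, not the paper's actual implementation:

```python
# Hypotheses are integer intervals [a, b] within [lo, hi]; the likelihood of n
# examples consistent with h is (1/|h|)**n -- the "size principle", which
# favors the smallest hypotheses that still contain all the examples.
def concept_posterior(examples, lo=0, hi=100):
    posterior = {}
    for a in range(lo, hi + 1):
        for b in range(a, hi + 1):
            if all(a <= x <= b for x in examples):
                posterior[(a, b)] = (1.0 / (b - a + 1)) ** len(examples)
    z = sum(posterior.values())
    return {h: p / z for h, p in posterior.items()}

def p_in_concept(y, examples):
    """Probability that a novel item y belongs to the concept, averaging
    over all interval hypotheses weighted by their posterior."""
    return sum(p for (a, b), p in concept_posterior(examples).items()
               if a <= y <= b)

# Items near the observed cluster are generalized to more readily than far ones.
near = p_in_concept(50, [40, 42, 45])
far = p_in_concept(80, [40, 42, 45])
print(near > far)  # -> True
```

With more examples, the exponent grows and the generalization gradient sharpens around the observed cluster, matching the qualitative pattern reported for human subjects.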

  • Rules and Similarity in Concept Learning - NeurIPS'99, 1999. [All Versions]. [Preprint]. This paper argues that two apparently distinct modes of generalizing concepts - abstracting rules and computing similarity to exemplars - should both be seen as special cases of a more general Bayesian learning framework. Bayes explains the specific workings of these two modes - which rules are abstracted, how similarity is measured - as well as why generalization should appear rule- or similarity-based in different situations. This analysis also suggests why the rules/similarity distinction, even if not computationally fundamental, may still be useful at the algorithmic level as part of a principled approximation to fully Bayesian learning.

  • Theory-based Bayesian models of inductive learning and reasoning - Trends in Cognitive Sciences, 2006. [All Versions]. [Preprint]. Inductive inference allows humans to make powerful generalizations from sparse data when learning about word meanings, unobserved properties, causal relationships, and many other aspects of the world. Traditional accounts of induction emphasize either the power of statistical learning, or the importance of strong constraints from structured domain knowledge, intuitive theories or schemas. This paper argues that both components are necessary to explain the nature, use and acquisition of human knowledge, and the authors introduce a theory-based Bayesian framework for modeling inductive learning and reasoning as statistical inferences over structured knowledge representations.

  • Word learning as Bayesian inference - Psychological Review, 2007. [All Versions]. [APA]. Fei Xu's review on Bayesian word learning.

  • How to Grow a Mind: Statistics, Structure, and Abstraction - Science, 2011. [All Versions]. Josh Tenenbaum's review on Bayesian theory induction.

  • Human-level concept learning through probabilistic program induction - Science, 2015. [All Versions]. [Supplementary Material]. Bayesian program induction for few-shot learning.

  • Building Machines That Learn and Think Like People - Behavioral and Brain Sciences, 2017. [All Versions]. Brenden Lake and Josh Tenenbaum's review on Bayesian modeling.

  • Building machines that learn and think with people - Nature Human Behavior, 2024. [All Versions]. [Preprint]. This perspective shows how the science of collaborative cognition can be put to work to engineer systems that really can be called ‘thought partners’, systems built to meet humans' expectations and complement humans' limitations. The authors lay out several modes of collaborative thought in which humans and artificial intelligence thought partners can engage, and they propose desiderata for human-compatible thought partnerships. Drawing on motifs from computational cognitive science, this work motivates an alternative scaling path for the design of thought partners and ecosystems around their use through a Bayesian lens, whereby the constructed partners actively build and reason over models of the human and world.

  • The rational basis of representativeness - CogSci'01, 2001. [All Versions].

  • Testing a Bayesian Measure of Representativeness Using a Large Image Database - NeurIPS'11, 2011. [All Versions].

  • Constructing a hypothesis space from the Web for large-scale Bayesian word learning - CogSci'12, 2012. [All Versions].

  • Human-level few-shot concept induction through minimax entropy learning - Science Advances, 2024. [All Versions]. This paper introduces a computational model designed to emulate human inductive reasoning on abstract reasoning tasks, such as those in IQ tests, using a minimax entropy approach. This method combines identifying the most effective constraints on data via minimum entropy with determining the best combination of them via maximum entropy.

Generative Model

Nonparametric Model

Bayesian Optimization

Concepts

Theory of Concepts

Human Concept Representation

AI Concept Representation

Complexity & Information Theory

Theory

Dimensionality Reduction

Visual Complexity

Communications

Non-Verbal Communication

Pragmatics

Language Compositionality

Coordination

  • In situ bidirectional human-robot value alignment - Science Robotics, 2022. [All Versions]. [Preprint]. This paper proposes an explainable artificial intelligence (XAI) system in which a group of robots predicts users’ values by taking in situ feedback into consideration while communicating their decision processes to users through explanations. To learn from human feedback, the XAI system integrates a cooperative communication model for inferring human values associated with multiple desirable goals. To be interpretable to humans, it simulates human mental dynamics and predicts optimal explanations using graphical models.

  • From Explicit Communication to Tacit Cooperation: A Novel Paradigm for Cooperative MARL - AAMAS'24, 2024. [All Versions]. Drawing inspiration from human team cooperative learning, this paper proposes a novel paradigm that facilitates a gradual shift from explicit communication to tacit cooperation.

Domain Specific Language

Design Theory

  • Domain-Specific Language - Wikipedia. Wikipedia entry on Domain-Specific Languages.

  • Domain Engineering - Wikipedia. Wikipedia entry on Domain Engineering.

  • Domain-Specific Languages - Pearson Education, 2010. [All Versions]. [Domain-Specific Languages Guide]. When carefully selected and used, Domain-Specific Languages (DSLs) may simplify complex code, promote effective communication with customers, improve productivity, and unclog development bottlenecks. In Domain-Specific Languages, noted software development expert Martin Fowler first provides the information software professionals need to decide if and when to utilize DSLs. Then, where DSLs prove suitable, Fowler presents effective techniques for building them, and guides software engineers in choosing the right approaches for their applications.
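To make the internal (embedded) flavor of DSL concrete, here is a minimal fluent-interface sketch in Python. All names are illustrative, not taken from Fowler's book: method chaining lets rule definitions read close to the domain language while remaining ordinary host-language code.

```python
# A tiny embedded DSL for business rules, in the fluent-interface style:
# each builder method returns self, so definitions chain into readable phrases.
class Rule:
    def __init__(self, name):
        self.name, self.conditions, self.actions = name, [], []

    def when(self, condition):
        self.conditions.append(condition)
        return self  # returning self enables method chaining

    def then(self, action):
        self.actions.append(action)
        return self

    def fire(self, facts):
        """Apply every action iff all conditions hold on the facts."""
        if all(cond(facts) for cond in self.conditions):
            for action in self.actions:
                action(facts)
        return facts

# Usage reads close to the domain language:
discount = (Rule("bulk-discount")
            .when(lambda f: f["quantity"] >= 10)
            .then(lambda f: f.update(price=f["price"] * 0.9)))

print(discount.fire({"quantity": 12, "price": 100.0}))  # price becomes 90.0
```

An external DSL for the same domain would instead define its own syntax and parser; the trade-off Fowler discusses is exactly this choice between host-language convenience and notational freedom.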

  • Comparison of multi-paradigm programming languages - Wikipedia. Programming languages may support multiple programming paradigms. This Wikipedia encyclopedia entry lists a concise reference for the programming paradigms.

  • Epigrams on programming - ACM SIGPLAN Notices, 1982. [All Versions].

  • The complete guide to (external) Domain Specific Languages. An introduction to Domain Specific Languages (DSL) based on 19 DSL cases.

  • When and How to Develop Domain-Specific Languages - ACM Computing Surveys, 2005. [All Versions]. [Preprint]. Domain-specific languages (DSLs) are languages tailored to a specific application domain. They offer substantial gains in expressiveness and ease of use compared with general-purpose programming languages in their domain of application. DSL development is hard, requiring both domain knowledge and language development expertise. Few people have both. Not surprisingly, the decision to develop a DSL is often postponed indefinitely, if considered at all, and most DSLs never get beyond the application library stage. Although many articles have been written on the development of particular DSLs, there is very limited literature on DSL development methodologies and many questions remain regarding when and how to develop a DSL. To aid the DSL developer, this survey paper identifies patterns in the decision, analysis, design, and implementation phases of DSL development. These patterns improve and extend earlier work on DSL design patterns.

  • Design Guidelines for Domain Specific Languages - OOPSLA Workshop on Domain-Specific Modeling (DSM'09), 2009. [All Versions]. Designing a new domain-specific language is, like any other complex task, sometimes error-prone and usually time-consuming, especially if the language is to be of high quality and comfortably usable. Existing tool support focuses on simplifying technical aspects but lacks support for enforcing principles of good language design. This paper investigates guidelines that are useful for designing domain-specific languages, largely based on the authors' experience in developing languages as well as on existing guidelines for general-purpose languages (GPLs) and modeling languages. The guidelines support DSL developers in achieving better quality of the language design and better acceptance among its users.

  • Domain-specific languages: an annotated bibliography - ACM SIGPLAN Notices, 2000. [All Versions]. A survey on the topic of domain-specific languages as used for the construction and maintenance of software systems. The survey lists a selection of 75 key publications in the area, and provides a summary for each of the papers. Moreover, the survey discusses terminology, risks and benefits, example domain-specific languages, design methodologies, and implementation techniques.

  • Usability Evaluation of Domain-Specific Languages - ICQICT'12, 2012. [All Versions]. [Preprint]. The purpose of this proposal is to contribute to the systematic activity of Software Language Engineering by focusing on the issue of the Usability evaluation of DSLs. Usability evaluation is often skipped, relaxed, or at least omitted from papers reporting development of DSLs. The authors argue that a systematic approach based on User Interface experimental validation techniques should be used to assess the impact of new DSLs. For that purpose, the authors propose to merge common Usability evaluation processes with the DSL development process.

Design Practices

  • No Grammar to Rule Them All: A Survey of JSON-style DSLs for Visualization - IEEE Transactions on Visualization and Computer Graphics, 2022. [All Versions]. There has been substantial growth in the use of JSON-based grammars, as well as other standard data serialization languages, to create visualizations. Each of these grammars serves a purpose: some focus on particular computational tasks (such as animation), some are concerned with certain chart types (such as maps), and some target specific data domains (such as ML). Despite the prominence of this interface form, there has been little detailed analysis of the characteristics of these languages. This study surveys and analyzes the design and implementation of 57 JSON-style DSLs for visualization. The authors analyze these languages supported by a collected corpus of examples for each DSL (consisting of 4395 instances) across a variety of axes organized into concerns related to domain, conceptual model, language relationships, affordances, and general practicalities. The authors identify tensions throughout these areas, such as between formal and colloquial specifications, among types of users, and within the composition of languages. Through this work, the authors seek to support language implementers by elucidating the choices, opportunities, and tradeoffs in visualization DSL design.

  • Quantifying usability of domain-specific languages: An empirical study on software maintenance - Journal of Systems and Software, 2015. [All Versions]. A DSL aims to support software development by offering abstractions to a particular domain. It is expected that DSLs improve the maintainability of artifacts otherwise produced with general-purpose languages. However, the maintainability of the DSL artifacts and, hence, their adoption in mainstream development, is largely dependent on the usability of the language itself. Unfortunately, it is often hard to identify their usability strengths and weaknesses early, as there is no guidance on how to objectively reveal them. Usability is a multi-faceted quality characteristic, which is challenging to quantify beforehand by DSL stakeholders. There is even less support on how to quantitatively evaluate the usability of DSLs used in maintenance tasks. In this context, this paper reports a study to compare the usability of textual DSLs under the perspective of software maintenance. A usability measurement framework was developed based on the cognitive dimensions of notations. The framework was evaluated both qualitatively and quantitatively using two DSLs in the context of two evolving object-oriented systems. The results suggested that the proposed metrics were useful: (1) to early identify DSL usability limitations, (2) to reveal specific DSL features favoring maintenance tasks, and (3) to successfully analyze eight critical DSL usability dimensions.

  • How Domain Experts Use an Embedded DSL - OOPSLA'23, 2023. [All Versions]. Programming tools are increasingly integral to research and analysis in myriad domains, including specialized areas with no formal relation to computer science. Embedded domain-specific languages (eDSLs) have the potential to serve these programmers while placing relatively light implementation burdens on language designers. However, barriers to eDSL use reduce their practical value and adoption. This work aims to deepen the understanding of how programmers use eDSLs and identify user needs to inform future eDSL designs. The authors performed a contextual inquiry (9 participants) with domain experts using Mimi, an eDSL for climate change economics modeling. A thematic analysis identified five key themes, including: the interaction between the eDSL and the host language has significant and sometimes unexpected impacts on eDSL user experience, and users preferentially engage with domain-specific communities and code templates rather than host language resources.

Design Automation

  • AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints - ACL'24, 2024. [All Versions]. [Project]. The original paper on the automated design of DSLs. This paper introduces the AutoDSL framework to automate DSL-based constraint design across various domains. Utilizing domain specified experimental protocol corpora, AutoDSL optimizes syntactic constraints and abstracts semantic constraints. Quantitative and qualitative analyses of the DSLs designed by AutoDSL across five distinct domains highlight its potential as an auxiliary module for language models, aiming to improve procedural planning and execution.

Imperative DSL Applications

Declarative DSL Applications

  • The BioPAX community standard for pathway data sharing - Nature Biotechnology, 2010. [All Versions]. [Preprint]. Biological Pathway Exchange (BioPAX) is a standard language to represent biological pathways at the molecular and cellular level and to facilitate the exchange of pathway data. BioPAX can represent metabolic and signaling pathways, molecular and genetic interactions and gene regulation networks.

  • Learning the language of viral evolution and escape - Science, 2021. [All Versions]. Natural language processing with two components: grammar (or syntax) and meaning (or semantics) for predicting which viral mutations may lead to viral escape.

  • OpenLaw - OpenLaw.io. It is now possible to model all or parts of legal agreements using code (smart contracts), decreasing the cost and friction of creating, securing, and generating binding legal agreements. Lawyers lack basic tools to build these dynamic, “smart” contracts in a way that is enforceable and understandable to a legal professional. OpenLaw is a technology stack to help power next-generation “smart” legal agreements, with a domain-specific markup language, an integration framework, and a series of general applications.

  • Scenic: a language for scenario specification and data generation - Machine Learning, 2022. [All Versions]. This paper proposes a domain-specific language, Scenic, for describing scenarios that are distributions over scenes and the behaviors of their agents over time. Scenic combines concise, readable syntax for spatiotemporal relationships with the ability to declaratively impose hard and soft constraints over the scenario.

  • Domain Specific Language for Smart Contract Development - ICBC'20, 2020. [All Versions]. [Preprint]. This research addresses the difficulty of understanding that arises from the conceptual discrepancy between contractual clauses and the corresponding Solidity code, through the design and study of a domain-specific smart contract language based on a higher level of abstraction that can be automatically transformed into an implementation.

  • iContractML 2.0: A domain-specific language for modeling and deploying smart contracts onto multiple blockchain platforms - Information and Software Technology, 2022. [All Versions]. Smart contracts play a vital role in many fields. Despite being called smart, developing a smart contract is a tedious task that goes beyond defining a set of contractual rules: in addition to business knowledge, coding a smart contract requires strong technical knowledge of a multiplex of new and rapidly changing domain-specific languages and blockchain platforms. The goal of this paper is to assist developers in building smart contracts independently of the language or the target blockchain platform. To this end, it presents the second-generation smart contract language iContractML 2.0, an extensible framework that empowers developers to model and generate functional smart contract code deployable onto multiple blockchain platforms.

  • PClean: Bayesian Data Cleaning at Scale with Domain-Specific Probabilistic Programming - ICML'21, 2021. [All Versions]. This work presents PClean, a probabilistic programming language (PPL) that leverages dataset-specific knowledge to automate Bayesian data cleaning, despite the diversity of real-world error patterns and the hardness of inference.

  • A Language for Counterfactual Generative Models - ICML'21, 2021. [All Versions]. [Project]. This paper presents Omega, a probabilistic programming language with support for counterfactual inference. This feature is accomplished by introducing a new operator to probabilistic programming akin to Pearl’s do.
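The counterfactual machinery behind an operator akin to Pearl's do can be sketched on a tiny structural causal model: hold the exogenous noise fixed to the factual world, then re-run the model under an intervention. This is a minimal Python sketch, not Omega's actual syntax; the rain/sprinkler model is a standard textbook example.

```python
def scm(u_rain, do_sprinkler=None):
    # Tiny structural causal model: rain suppresses the sprinkler; either wets the grass.
    rain = u_rain
    # do(sprinkler=v) severs the rain -> sprinkler edge and clamps the value.
    sprinkler = (not rain) if do_sprinkler is None else do_sprinkler
    wet = rain or sprinkler
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}

# Factual world: it rained, so the sprinkler stayed off, and the grass is wet.
factual = scm(u_rain=True)

# Counterfactual query: keeping the exogenous noise fixed (it rained),
# would the grass still be wet had we forced the sprinkler off?
counterfactual = scm(u_rain=True, do_sprinkler=False)
```

Here the counterfactual grass is still wet, because rain alone suffices; a purely observational conditioning could not separate the two causes this way.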

  • Product Line Engineering Using Domain-Specific Languages - ISPLC'11, 2011. [All Versions]. [Preprint]. This paper investigates the application of domain-specific languages in product line engineering (PLE).

  • A Domain-Specific Language for Product-Process-Resource Modeling - ETFA'21, 2021. [All Versions]. This paper presents the design of the PPR-DSL to effectively and efficiently represent Product-Process-Resource (PPR) aspects and evaluate constraints defined for modeling PPR views in the Formalized Process Description standard (VDI 3682).

  • The Scene Language: Representing Scenes with Programs, Words, and Embeddings - 2024. [All Versions]. [Project]. This paper introduces the Scene Language, a visual scene representation that concisely and precisely describes the structure, semantics, and identity of visual scenes. It represents a scene with three key components: a program that specifies the hierarchical and relational structure of entities in the scene, words in natural language that summarize the semantic class of each entity, and embeddings that capture the visual identity of each entity. This representation can be inferred from pre-trained language models via a training-free inference technique, given text or image inputs.

Logic DSL Applications

  • Situation Calculus - Wikipedia. Wikipedia on Situation Calculus, a logic formalism designed for representing and reasoning about dynamical domains.

  • Answer Set Programming - ICLPNR'99, 1999. [All Versions]. [Preprint]. The original paper on Answer Set Programming (ASP), a form of declarative programming oriented towards difficult search problems, on the use of nonmonotonic reasoning in knowledge representation. In ASP solutions to a problem are represented by answer sets (known also as stable models), and not by answer substitutions produced in response to a query, as in conventional logic programming.

  • Action Languages, Answer Sets, and Planning - The Logic Programming Paradigm, 1999. [All Versions]. [Preprint]. This is a discussion of some of the achievements and challenges related to representing actions and the design of planners from the perspective of logic programming. The authors talk about recent work on action languages and translating them into logic programming, on representing possible histories of an action domain by answer sets, on efficient implementations of the answer set semantics and their use for generating plans, and on causal logic and its relation to planning algorithms. Recent progress in these areas may lead to the creation of planners which are based on the ideas of logic programming and combine the use of expressive action description languages with efficient computational procedures.

  • Qualitative Simulation - Artificial Intelligence, 1986. [All Versions]. [Preprint]. This paper presents a precise definition of qualitative structure and behavior descriptions as abstractions of differential equations and continuously differentiable functions. The authors present a new algorithm for qualitative simulation that generalizes the best features of existing algorithms, and allows direct comparisons among alternate approaches. Starting with a set of constraints abstracted from a differential equation, this work proves that the QSIM algorithm is guaranteed to produce a qualitative behavior corresponding to any solution to the original equation. The paper also shows that any qualitative simulation algorithm will sometimes produce spurious qualitative behaviors: ones which do not correspond to any mechanism satisfying the given constraints. These observations suggest specific types of care that must be taken in designing applications of qualitative causal reasoning systems, and in constructing and validating a knowledge base of mechanism descriptions.

  • Qualitative Reasoning: Modeling and Simulation with Incomplete Knowledge - MIT Press, 1994. [All Versions]. This book presents, within a conceptually unified theoretical framework, a body of methods that have been developed over the past fifteen years for building and simulating qualitative models of physical systems - bathtubs, tea kettles, automobiles, the physiology of the body, chemical processing plants, control systems, electrical systems - where knowledge of that system is incomplete. The primary tool for this work is the author's QSIM algorithm, which is discussed in detail. Qualitative models are better able than traditional models to express states of incomplete knowledge about continuous mechanisms. Qualitative simulation guarantees to find all possible behaviors consistent with the knowledge in the model. This expressive power and coverage is important in problem solving for diagnosis, design, monitoring, explanation, and other applications of artificial intelligence.

  • Qualitative and quantitative simulation: bridging the gap - Artificial Intelligence, 1997. [All Versions]. Shortcomings of qualitative simulation and of quantitative simulation motivate combining them to do simulations exhibiting strengths of both. The resulting class of techniques is called semi-quantitative simulation. One approach to semi-quantitative simulation is to use numeric intervals to represent incomplete quantitative information. This research demonstrates semi-quantitative simulation using intervals in an implemented semi-quantitative simulator called Q3. Q3 progressively refines a qualitative simulation, providing increasingly specific quantitative predictions which can converge to a numerical simulation in the limit while retaining important correctness guarantees from qualitative and interval simulation techniques.
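The interval idea behind semi-quantitative simulation can be sketched with interval arithmetic on a toy decay system dx/dt = -k·x, where both k and the initial state are known only up to bounds. This is a naive Euler-step sketch, not Q3's algorithm; note that naive interval propagation widens the bounds over time (the dependency problem), which is exactly what refinement techniques like Q3's counteract.

```python
def interval_mul(a, b):
    # Sound product of two intervals: take extrema over endpoint products.
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def interval_sub(a, b):
    return (a[0] - b[1], a[1] - b[0])

def simulate(x0, k, dt=0.1, steps=3):
    # Interval Euler integration of dx/dt = -k*x: the result bounds every
    # trajectory consistent with the incomplete knowledge of k and x0.
    x = x0
    for _ in range(steps):
        x = interval_sub(x, interval_mul((dt, dt), interval_mul(k, x)))
    return x

lo, hi = simulate(x0=(1.0, 2.0), k=(0.9, 1.1))
```

After a few steps the interval still brackets all true solutions (a correctness guarantee), but it is looser than the tightest possible bounds, motivating the progressive refinement described in the paper.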

  • A Logic Programming Language for Computational Nucleic Acid Devices - ACS Synthetic Biology, 2018. [All Versions]. This paper presents a logic programming language that allows a broad range of computational nucleic acid systems to be designed and analyzed. The language extends standard logic programming with a novel equational theory to express nucleic acid molecular motifs. It automatically identifies matching motifs present in the full system, in order to apply a specified transformation expressed as a logical rule.

DSL Program Synthesis

  • pix2code: Generating Code from a Graphical User Interface Screenshot - ACM SIGCHI Symposium on Engineering Interactive Computing Systems, 2018. [All Versions]. [Code]. [Website]. This paper shows that deep learning methods can be leveraged to train a model end-to-end to automatically reverse engineer user interfaces and generate code from a single input image with over 77% accuracy for three different platforms (i.e., iOS, Android, and web-based technologies).

  • Learning to Infer Graphics Programs from Hand-Drawn Images - NeurIPS'18, 2018. [All Versions]. The method learns a model that uses program synthesis techniques to recover a graphics program from drawing primitives. These programs have constructs like variable bindings, iterative loops, or simple kinds of conditionals. With a graphics program in hand, the system can correct errors made by the deep network and extrapolate drawings.

  • babble: Learning Better Abstractions with E-Graphs and Anti-unification - POPL'23, 2023. [All Versions]. This paper proposes library learning modulo theory (LLMT), a new library learning algorithm that additionally takes as input an equational theory for a given problem domain. LLMT uses e-graphs and equality saturation to compactly represent the space of programs equivalent modulo the theory, and uses a novel e-graph anti-unification technique to find common patterns in the corpus more directly and efficiently.
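The anti-unification at the core of such library learning can be sketched on plain expression trees: matching structure is kept, and mismatches become shared pattern variables (the most specific generalization). This is a minimal syntactic sketch; the e-graph and equational-theory machinery that babble adds on top is omitted.

```python
def antiunify(a, b, subst=None):
    # Most specific generalization of two expression trees: identical parts
    # are kept, mismatched subtrees are replaced by shared variables "?0", "?1", ...
    if subst is None:
        subst = {}
    if a == b:
        return a
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        return tuple(antiunify(x, y, subst) for x, y in zip(a, b))
    key = (repr(a), repr(b))
    if key not in subst:  # the same mismatched pair always maps to the same variable
        subst[key] = f"?{len(subst)}"
    return subst[key]

# (+ 1 (* 2 x)) vs (+ 1 (* 3 x)) generalizes to (+ 1 (* ?0 x))
pattern = antiunify(("+", 1, ("*", 2, "x")), ("+", 1, ("*", 3, "x")))
print(pattern)  # ('+', 1, ('*', '?0', 'x'))
```

The shared substitution table is what makes repeated mismatches reuse one variable, so the extracted pattern can serve as a candidate library abstraction.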

  • Top-Down Synthesis for Library Learning - POPL'23, 2023. [All Versions]. This paper introduces corpus-guided top-down synthesis as a mechanism for synthesizing library functions that capture common functionality from a corpus of programs in a domain specific language (DSL). The algorithm builds abstractions directly from initial DSL primitives, using syntactic pattern matching of intermediate abstractions to intelligently prune the search space and guide the algorithm towards abstractions that maximally capture shared structures in the corpus.

  • DreamCoder: growing generalizable, interpretable knowledge with wake–sleep Bayesian program learning - Philosophical Transactions of the Royal Society A, 2023. [All Versions]. [Preprint]. This paper presents DreamCoder, a system that learns to solve problems by writing programs. It builds expertise by creating domain-specific programming languages for expressing domain concepts, together with neural networks to guide the search for programs within these languages. A ‘wake–sleep’ learning algorithm alternately extends the language with new symbolic abstractions and trains the neural network on imagined and replayed problems. DreamCoder solves both classic inductive programming tasks and creative tasks such as drawing pictures and building scenes.

  • Grammar Prompting for Domain-Specific Language Generation with Large Language Models - NeurIPS'23, 2023. [All Versions]. Grammar prompting is a simple approach to enable LLMs to use external knowledge and domain-specific constraints expressed through a grammar in Backus--Naur Form (BNF) during in-context learning.
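The spirit of grammar prompting can be sketched in a few lines: the BNF grammar is shown to the LLM in the prompt, and candidate completions that violate it are filtered out (or trigger a re-ask). The toy grammar and candidate list below are invented for illustration, and the checker uses a regex in place of a full BNF parser.

```python
import re

# Toy BNF grammar for a sum DSL:  <expr> ::= <num> | <num> "+" <expr>
def valid(s):
    return re.fullmatch(r"\d+(\+\d+)*", s) is not None

def constrained_pick(candidates):
    # Grammar prompting, in spirit: keep only completions the grammar accepts.
    return [c for c in candidates if valid(c)]

print(constrained_pick(["1+2+3", "1++2", "7", "+4"]))  # ['1+2+3', '7']
```

In the paper the grammar additionally steers generation in-context (the model first predicts a specialized grammar, then a program conforming to it); the filter above only captures the validity-checking half.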

  • Errors are Useful Prompts: Instruction Guided Task Programming with Verifier-Assisted Iterative Prompting - 2023. [All Versions]. [Code]. [Website]. This paper proposes CLAIRIFY, an approach that combines automatic iterative prompting with program verification to ensure programs written in data-scarce domain-specific language are syntactically valid and incorporate environment constraints.

  • PhotoScout: Synthesis-Powered Multi-Modal Image Search - ACM SIGCHI'24, 2024. [All Versions]. This paper explores a new multi-modal image search approach that allows users to conveniently specify and perform semantic image search tasks. With the tool, PhotoScout, the user interactively provides natural language descriptions, positive and negative examples, and object tags to specify their search tasks. Under the hood, PhotoScout is powered by a program synthesis engine that generates visual queries in a domain-specific language and executes the synthesized program to retrieve the desired images.

Cognitive Foundations

  • The Child as Hacker - Trends in Cognitive Sciences, 2020. [All Versions]. The scope of human learning and development poses a radical challenge for cognitive science. The authors propose that developmental theories can address this challenge by adopting perspectives from computer science. Many of our best models treat learning as analogous to computer programming because symbolic programs provide the most compelling account of sophisticated mental representations. The authors specifically propose that children’s learning is analogous to a particular style of programming called hacking, making code better along many dimensions through an open-ended set of goals and activities. By contrast to existing theories, which depend primarily on local search and simple metrics, this view highlights the many features of good mental representations and the multiple complementary processes children use to create them.

  • Communicating Natural Programs to Humans and Machines - NeurIPS'22, 2022. [All Versions]. While humans readily generate and interpret instructions in a general language, computer systems are shackled to a narrow domain-specific language that they can precisely execute. This makes building intelligent systems that can generalize to novel situations such as ARC difficult. Human-generated instructions are referred to as “natural programs”. While they resemble computer programs, they are distinct in two ways: first, they contain a wide range of primitives; second, they frequently leverage communicative strategies beyond directly executable code.

  • Symbolic metaprogram search improves learning efficiency and explains rule learning in humans - Nature Communications, 2024. [All Versions]. Symbolic models based on program learning successfully explain rule-learning in many domains, but performance degrades quickly as program complexity increases. It remains unclear how to scale symbolic rule-learning methods to model human performance in challenging domains. This work shows that symbolic search over the space of metaprograms—programs that revise programs—dramatically improves learning efficiency. On a behavioral benchmark of 100 algorithmically rich rules, this approach fits human learning more accurately than alternative models while also using orders of magnitude less search. The computation required to match median human performance is consistent with conservative estimates of human thinking time. The results suggest that metaprogram-like representations may help human learners to efficiently acquire rules.

Problem Solving

Human-Level Problem Solving

Planning

Intrinsic Motivation

Reinforcement Learning

Inverse Reinforcement Learning

System 1 & System 2

Dual-Coding Theory

Neural-Symbolic AI

Explainability

Trustworthy AI

Strong Machine Learning

Explainable Deep Learning

Embodied Intelligence

Evolutionary Intelligence

Methodologies for Experiments

Quantitative Analysis

Scaling Up Behavioral Studies

Decision Making

Question Answering

Human-Machine Comparison

Association Test

Virtual Reality

Meta-Level Considerations

Meta Learning

Marr's Levels of Analysis

Gestalt

The Aha! Moment

Rationality

Cognitive Architecture

Science Logology

Philosophy of Science

Science of Science

Literature Mining

Scientific Writing

Science Education

  • PersLEARN: Research Training through the Lens of Perspective Cultivation - ACL'23, 2023. [All Versions]. Scientific research is inherently shaped by its authors’ perspectives, influenced by various factors such as their personality, community, or society. Junior researchers often face challenges in identifying the perspectives reflected in the existing literature and struggle to develop their own viewpoints. To address the problem, this paper introduces PersLEARN, a tool designed to facilitate the cultivation of scientific perspectives, starting from a basic seed idea and progressing to a well-articulated framework.

Democratization of Science

Laboratory Automation

  • Reconfigurable system for automated optimization of diverse chemical reactions - Science, 2018. [All Versions]. [Preprint]. This paper describes a plug-and-play, continuous-flow chemical synthesis system that mitigates the challenge of automating diverse chemical reactions with an integrated combination of hardware, software, and analytics. The system software controls the user-selected reagents and unit operations (reactors and separators), processes reaction analytics (high-performance liquid chromatography, mass spectrometry, vibrational spectroscopy), and conducts automated optimizations.

  • Organic synthesis in a modular robotic system driven by a chemical programming language - Science, 2019. [All Versions]. [Preprint]. [Perspective: Democratizing synthesis by automation]. This paper develops an autonomous compiler and robotic laboratory platform to synthesize organic compounds on the basis of standardized methods descriptions. The platform comprises conventional equipment such as round-bottom flasks, separatory funnels, and a rotary evaporator to maximize its compatibility with extant literature. The authors showcase the system with short syntheses of three common pharmaceuticals that proceeded comparably to manual synthesis.

  • A universal system for digitization and automatic execution of the chemical synthesis literature - Science, 2020. [All Versions]. [Preprint]. [XDL Documentation]. [XDL Schema Database]. This paper reports a software platform that uses natural language processing to translate the organic chemistry literature directly into editable code, which in turn can be compiled to drive automated synthesis of the compound in the laboratory.

  • Digitization and validation of a chemical synthesis literature database in the ChemPU - Science, 2022. [All Versions]. [Preprint]. This paper presents an automatically executable chemical reaction database of 100 molecules representative of the range of reactions found in contemporary organic synthesis. The chemical reaction codes or χDLs for the reactions have been stored in a database for version control, validation, collaboration, and data mining. Of these syntheses, more than 50 entries from the database have been downloaded and robotically run in seven modular chemputers with yields and purities comparable to those achieved by an expert chemist.

  • Chemputation and the Standardization of Chemical Informatics - Journal of the American Chemical Society (Au), 2021. [All Versions]. This paper describes a standard hardware (the chemical processing programming architecture --- the ChemPU) to encompass all chemical synthesis, an approach which unifies all chemistry automation strategies, from solid-phase peptide synthesis, to HTE flow chemistry platforms, while at the same time establishing a publication standard so that researchers can exchange chemical code (χDL) to ensure reproducibility and interoperability.

  • Convergence of multiple synthetic paradigms in a universally programmable chemical synthesis machine - Nature Chemistry, 2020. [All Versions]. [Preprint]. This paper shows how the Chemputer synthesis robot can be programmed to perform many different reactions, including solid-phase peptide synthesis, iterative cross-coupling and accessing reactive, unstable diazirines in a single, unified system with high yields and purity.

  • An autonomous portable platform for universal chemical synthesis - Nature Chemistry, 2022. [All Versions]. [Preprint]. This paper presents a portable suitcase-sized chemical synthesis platform containing all the modules required for synthesis and purification. The system uses a chemical programming language coupled to a digital reactor generator to produce reactors and executable protocols based on text-based literature syntheses. Simultaneously, the platform generates a reaction pressure fingerprint, used to monitor processes within the reactors and remotely perform a protocol quality control.

  • An integrated self-optimizing programmable chemical synthesis and reaction engine - Nature Communications, 2024. [All Versions]. This paper presents a dynamically programmable system capable of making, optimizing, and discovering new molecules which utilizes seven sensors that continuously monitor the reaction. By developing a dynamic programming language, the work demonstrates the 10-fold scale-up of a highly exothermic oxidation reaction, end point detection, as well as detecting critical hardware failures.

  • A mobile robotic chemist - Nature, 2020. [All Versions]. [Preprint]. This work uses a mobile robot to search for improved photocatalysts for hydrogen production from water. The robot operated autonomously over eight days, performing 688 experiments within a ten-variable experimental space, driven by a batched Bayesian search algorithm. This autonomous search identified photocatalyst mixtures that were six times more active than the initial formulations, selecting beneficial components and deselecting negative ones.

  • An autonomous laboratory for the accelerated synthesis of novel materials - Nature, 2023. [All Versions]. This paper introduces the A-Lab, an autonomous laboratory for the solid-state synthesis of inorganic powders. This platform uses computations, historical data from the literature, machine learning (ML) and active learning to plan and interpret the outcomes of experiments performed using robotics. Over 17 days of continuous operation, the A-Lab realized 41 novel compounds from a set of 58 targets including a variety of oxides and phosphates that were identified using large-scale ab initio phase-stability data from the Materials Project and Google DeepMind.

  • The Internet of Things comes to the lab - Nature, 2017. [All Versions]. The emergence of connected instruments and equipment promises to untether researchers from the laboratory --- letting them fine-tune experiments and analyse data remotely.

  • A dynamic knowledge graph approach to distributed self-driving laboratories - Nature Communications, 2024. [All Versions]. This work employs ontologies to capture data and material flows in design-make-test-analyse cycles, utilising autonomous agents as executable knowledge components to carry out the experimentation workflow. Data provenance is recorded to ensure its findability, accessibility, interoperability, and reusability. The architecture is built upon the World Avatar project, which seeks to create an all-encompassing digital twin based on a dynamic knowledge graph.

  • Balancing act: when to flex and when to stay fixed - Trends in Chemistry, 2023. [All Versions]. This perspective article provides essential insights into the decision-making process for choosing automation platforms, highlighting the suitability of fixed automation for standardized tasks and the strategic use of flexible automation in dynamic research settings.

  • What is a minimal working example for a self-driving laboratory? - Matter, 2022. [All Versions]. This paper proposes SDL-Demo: a low-cost “Hello, World!” for self-driving laboratories that combines “Hello, World!” tasks from electronics, physics-based simulations, and optimization. SDL-Demo is modular and extensible, making it an ideal candidate for low-cost teaching and prototyping of self-driving laboratory concepts.

  • Robotic search for optimal cell culture in regenerative medicine - eLife, 2022. [All Versions]. This paper develops a robotic AI system with a batch Bayesian optimization algorithm that autonomously induces the differentiation of induced pluripotent stem cell-derived retinal pigment epithelial (iPSC-RPE) cells. From 200 million possible parameter combinations, the system performed cell culture in 143 different conditions in 111 days, resulting in 88% better iPSC-RPE production than that obtained by the pre-optimized culture in terms of the pigmentation scores.

  • Cell Culture: Implementing robotics and artificial intelligence - eLife, 2022. [All Versions].

AI Assisted Research

Theory of Mind

Analogy

Causality

Commonsense

Intuitive Physics

AI Commonsense Reasoning

Commonsense Knowledgebase

Inductive Logic & Program Synthesis

Knowledge Representation

Cognitive Development

Learning in the Open World

Learning with Cognitive Plausibility

Academic Tools

Courses

Programming

  • Probabilistic Models of Cognition - MIT. The probabilistic approach to cognitive science, which models learning and reasoning as inference in complex probabilistic models.

Paper Writing

  • LaTeX Configuration - LaTeX. LaTeX template for a configuration file with an elegant reference style (gray-colored references, page backward references).

  • BibTeX Template - BibTeX. BibTeX template including abbreviations of journals and conferences in AI, Mathematics, and Cognitive Sciences.

  • bioRender - bioRender. Create professional science figures in minutes by browsing thousands of pre-made icons and templates from more than 30 fields of life sciences.

  • How to construct a Nature summary paragraph - Nature. Nature official guidelines for composing abstracts.

  • How to write a superb literature review - Nature, 2020. Nature speaks to old hands and first timers about the work they did to make their reviews sing.

  • Scientific Papers - Nature. Nature guidance on writing scientific papers.

  • The Machine Learning Reproducibility Checklist - McGill University. Guidelines for introducing a machine learning algorithm with reproducibility guarantees.

Paper Reading

Literature Management

Knowledge Management

Institute & Researcher

MIT

Stanford

Princeton

Harvard

UCLA

UC Berkeley

BNU

PKU

UCSD

NYU

JHU

SIT

People & Book

John Hopcroft

Theoretical computer scientist.

Ulf Grenander

Applied mathematician, the founder of General Pattern Theory.

David Marr

Computational cognitive neuroscientist, who established the Levels of Analysis.

Michael Tomasello

Cognitive scientist, laid the foundations for the study of human communication.

Judea Pearl

Applied mathematician, proposed causal intervention and counterfactual reasoning on Bayesian networks (twin networks).

Susan Carey

Developmental psychologist, proposed objects as part of the core knowledge of human intelligence.

Daniel Kahneman

Computational cognitive scientist and economist, laid the foundations of modern decision theory.

Karl Popper

Philosopher of science, the founder of falsificationism.

About

The initiator of this repo struggled to taxonomize related topics, since there are so many perspectives to follow, such as task-oriented, technique-oriented, and metaphysics-oriented ones. He finally decided to focus on the perspective of The Sciences of Intelligence: each topic describes a phenomenon of intelligence, or an intelligent behavior, and together they show the objectives of reverse-engineering human intelligence with computational methods. These topics are never restricted to specific technical methods or tasks; rather, they try to organize the nature of intelligence from both the software perspective and the hardware perspective.

Obviously, this reading list is far from covering every aspect of AGI and CoCoSci. Since the list is a by-product of the literature reviews conducted while the initiator was working on Abduction and Bayesian modeling, other topics are collected with some bias. Abduction may be the way humans explain the world with the known and discover the unknown, requiring much more investigation into its computational basis, cognitive underpinnings, and applications to AI. Please feel free to reach out!