This post introduces the core concepts of AI algorithms, and is summarized from "Artificial Intelligence: A Modern Approach" by Russell and Norvig, 3rd edition - http://aima.cs.berkeley.edu/ .
- General Definition: study of agents that receive percepts from the environment and perform actions
- Weak AI hypothesis - the assertion that software could simulate thought-like behavior. "Can machines think?" is considered an ill-defined question (does a submarine swim?). Strong AI hypothesis - the assertion that software could actually think - most AI researchers are not concerned with this.
- Current Goals: problem solving, planning, reasoning about knowledge, reasoning under uncertainty, decision making, learning, prediction
- Future Goals: problem classification, attention, creativity, emotion, philosophy, NLP, object recognition, consciousness?
- Encompasses logic, probability, statistics, calculus, perception, cognition, reasoning, control theory, game theory, neuroscience, economics
- Choice of data structures (representations) & algorithms constrained by environment: observability, determinism
- Problem solving - decide what to do x steps ahead - e.g. Searching world states, belief states (set of possible worlds), constraint satisfaction
- Reasoning & knowledge representation - represent perceived world states (& beliefs of states), predicted action effects. Representation languages (e.g. propositional logic, 1st order logic), inference algorithms --> reasoning about knowledge
- Planning - decide what steps to take based on problem and inference
- Reasoning under Uncertainty - Bayesian networks, variable elimination, MCMC, uncertain temporal reasoning, HMMs, Kalman filters, DBNs, game theory, mechanism design
- Learning - generation of knowledge required for decision making. Integrate statistical, symbolic & neural learning. Boosting algorithms, the EM algorithm, instance-based learning, kernel methods, SVMs
- IO - communicating, perceiving, acting - perceive an environment to build KR via sensor input (e.g. OR, NLP, monitoring)
- NLP - language processing, discourse processing, grammar induction, probabilistic language models, information retrieval, machine translation, object recognition
- Agent paradigm is a unifier of disparate subfields. Identical concept to control theory controller
- For the problems to which AI poses solutions, the 1st step must be to specify the task environment as fully as possible
- PEAS analysis - perf measure, environment, actuators, sensors. Note: some goals may conflict so tradeoffs may be required
- Eg automated taxi (an extremely open-ended task) - 1) perf measure: getting to correct destination, minimizing fuel consumption & wear & tear, minimizing trip time & cost, minimizing violations of traffic laws & disturbances to other drivers, maximizing safety & passenger comfort, maximizing profits. 2) environment: roads, other traffic, pedestrians, customers. 3) Actuators: steering, accelerator, brake, signal, horn, display. 4) Sensors: cameras, sonar, speedometer, GPS, odometer, accelerometer, engine sensors, keyboard
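As a toy illustration (not from the book), a PEAS analysis can be captured as a small data structure; the class and field names here are my own invention:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """Hypothetical PEAS task-environment specification (not AIMA code)."""
    performance_measure: list
    environment: list
    actuators: list
    sensors: list

taxi = PEAS(
    performance_measure=["correct destination", "minimize trip time & cost",
                         "minimize traffic-law violations", "maximize safety & comfort"],
    environment=["roads", "other traffic", "pedestrians", "customers"],
    actuators=["steering", "accelerator", "brake", "signal", "horn", "display"],
    sensors=["cameras", "sonar", "speedometer", "GPS", "odometer"],
)
print(len(taxi.sensors))  # 5
```

Writing the specification down as data makes the conflicting-goals tradeoff explicit: the perf measure is a list that a designer must weight, not a single number.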
- Task environment properties (dimensions) determine the appropriate agent design & the applicability of specific agent implementation techniques - IMHO: is there a system that can automatically assess the type of environment? Not always cut & dried. The hardest case is partially observable, stochastic, sequential, dynamic, continuous, multi-agent
- Fully observable vs partially observable - whether sensors have access to the complete state of the environment at each point in time relevant to the choice of action (as determined by the perf measure). Fully observable environments are convenient because the agent need not maintain any internal state to track the environment. An environment might be partially observable because of noisy & inaccurate sensors or because parts of the state are simply missing from the sensory data - eg a vacuum agent cannot tell if adjacent squares are dirty, a taxi agent can't read other drivers' minds. If an agent has no sensors then the environment is unobservable - however, using belief states instead of environment states, goals are still achievable.
- Deterministic vs stochastic - environment is deterministic if the next state of the environment is completely determined by the current state and the actions executed by the agent. If the environment is deterministic except for actions of other agents then it is strategic. An agent does not need to worry about uncertainty in a fully observable, deterministic environment. A partially observable complex environment could appear stochastic (from the agent's perspective) because it is difficult to track all unobserved aspects. Eg Taxi world is stochastic while vacuum world is deterministic (although can include stochastic elements such as randomly appearing dirt or an unreliable suction mechanism). If an environment is not fully observable or not deterministic then it is uncertain.
- Episodic vs. sequential - in episodic task environments, the agent's experience is divided into atomic episodes, where each episode consists of an agent perceiving & then performing a single action - the next episode doesn't depend on the actions taken in previous episodes (ie regardless of previous decisions - eg an assembly line). They are simpler, as there is no need to think ahead. Chess & driving are sequential because actions have consequences that affect the environment. See Markov property.
- Static vs. dynamic - dynamic environments can change while the agent is deliberating. More difficult as the agent must worry about the passage of time & continue looking at the world whilst deciding on an action - if no decision is made, then it counts as deciding to do nothing - eg driving. Semidynamic - eg clocked chess. Static - crossword.
- Discrete state vs. continuous state - the way time is handled according to agent's percepts & actions. A discrete state environment has a finite number of distinct states & a discrete set of percepts & actions (eg chess). Driving is continuous (in terms of speed, location, percepts & actions)
- Known vs unknown – agent's state of context knowledge – the laws of physics of the environment. In a known environment, action outcomes / outcome probabilities are given. In an unknown environment, laws must be learned.
- Single-agent vs. multi-agent - eg playing a 2 player game - does an agent treat other agents as agents or as stochastically behaved objects? This depends on whether the other entity's behavior is best described as maximizing a performance measure whose value depends on the original agent's behavior. Multi-agent environments can be competitive (where the maximization of 1 agent's perf measure minimizes the perf measures of other agents - eg in chess) or cooperative. Multi-agent environment design considerations include communication & stochastic behavior as rational because it avoids the pitfalls of predictability
- Environment class - AI experiments are not carried out in just a single rigid environment, but in a set of parameterized environments. A rational agent maximizes its average perf over the environment class
- Environment simulator - places agents in an environment, observes their behavior over time & evaluates them according to a given perf measure
- Environment generator - selects specific environments from an environment class (with certain likelihoods) in which to run the agent
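The simulator idea above can be sketched as a minimal loop; the agent, environment-step & perf-measure interfaces below are my own assumptions, with a 2-square vacuum world as the example environment:

```python
def run(agent, env_step, initial_state, perf_measure, steps=100):
    """Minimal environment simulator: feed percepts to the agent, apply its
    actions to the environment, & accumulate the performance measure."""
    state, score = initial_state, 0.0
    for _ in range(steps):
        percept = state                      # fully observable for simplicity
        action = agent(percept)
        state = env_step(state, action)
        score += perf_measure(state)
    return score

# Eg: a 2-square vacuum world; states are (location, dirt_A, dirt_B)
def env_step(state, action):
    loc, a, b = state
    if action == "Suck":
        return (loc, False, b) if loc == "A" else (loc, a, False)
    return ("B" if action == "Right" else "A", a, b)

def reflex_vacuum(percept):
    loc, a, b = percept
    if (loc == "A" and a) or (loc == "B" and b):
        return "Suck"
    return "Right" if loc == "A" else "Left"

score = run(reflex_vacuum, env_step, ("A", True, True),
            lambda s: (not s[1]) + (not s[2]), steps=10)
print(score)  # 18.0 for this run: +1 per clean square per step
```

An environment generator would simply draw `initial_state` (and the environment dynamics) from a parameterized distribution and average the returned scores over the class.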
State Representation data-types
- atomic (black box states) – no internal structure, only discernable feature is unique identity. Relevant algorithms: problem search, adversarial search, HMMs, Markov decision processes
- factored (states are vectors / tuples of attribute value pairs – values can be Boolean, real or symbolic). Can represent uncertainty via nullable values, reason about state transitions. Relevant algorithms: propositional logic, classical planning , DBNs, machine learning
- structured (states are relational objects, with their own attributes). Relevant algorithms: relational DB, FOL, KB learning, natural language understanding
- note: moving from atomic to factored to structured representations increases expressiveness & conciseness, but also complexity
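The three representation granularities above might be contrasted in code like this (an illustrative sketch only; the names are made up):

```python
# Atomic: a state is an opaque identifier - the only discernible feature is identity
atomic_state = "S42"

# Factored: a state is a tuple of attribute-value pairs; None marks an unknown value
factored_state = {"location": "A", "dirt_A": True, "dirt_B": None}

# Structured: states are relational objects with their own attributes
class Block:
    def __init__(self, name, on=None):
        self.name = name
        self.on = on          # relation to another object

table = Block("table")
b1 = Block("B1", on=table)
b2 = Block("B2", on=b1)

# A relational query the atomic & factored forms cannot express directly:
print(b2.on.on.name)  # table
```

Search algorithms only need the atomic view (identity and a successor function); propositional logic and most machine learning need the factored view; first-order logic needs the structured one.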
Foundations of AI
- formal rules of logic to infer valid conclusions from initial premises - mechanical generation of conclusions
- Philosophy of mind – mind-body problem: Dualism - a mental mind arises from a physical brain. Monism (Physicalism) – mental states are physical states – allows for possibility of strong AI.
- Causality - connection between knowledge & action. Actions are justified by a logical connection between goals & knowledge of the action's outcome --> an agent is as rational as its actions are justifiable.
- principle of induction - (generalization) general rules are acquired by exposure to repeated associations between their elements
- logical positivism – (Empiricism (evidence) + Rationalism (axiomatic reasoning)) all knowledge can be characterized by logical theories connected to observation sentences that correspond to sensory inputs - all meaningful statements can be verified / falsified by analyzing the meaning of the words (semantics) or by carrying out experiments - rules out metaphysics!
- Confirmation theory - the 1st theory of mind as a computational process - defined an explicit computational procedure for extracting knowledge from elementary experiences
- Computationalism - A machine has consciousness when it is aware of its own mental states & actions, is phenomenal when it feels emotions, and has intentionality when its purported beliefs, desires, & other representations are about the real world. Note: it is polite convention that people think when they process mental states. 2 perspectives:
- 1) Functionalism - the philosophy of mind congruent with AI - states that a mental state is an intermediate causal condition between input & output. Any two systems with isomorphic causal processes are considered to have the same mental states, therefore an application could have the same mental states as a person. Assumption: there is a level of abstraction that is substrate-agnostic, below which the specific implementation doesn't matter - if the processes are isomorphic down to this level, then the same mental states will occur. Critics claim that it does not account for qualia or intentionality. Response: Eliminative Materialism (Rorty, Churchland, Dennett) - philosophy of mind that rejects folk psychology / commonsense ideas about the mind and takes a purely scientific approach. Consciousness is epiphenomenal - something that happens, but casts no shadow
- 2) Biological Naturalism - opposed to Functionalism. States that mental states are high level emergent features that are caused by unspecified low level substrate-specific neurological processes, & that mental processes cannot be duplicated. Lookup tables are not conscious.
- strong AI Mind-body problem - asks how mental states & processes relate to physical brain states & processes. 2 views: 1) Dualism (Plato, Descartes) - soul & body are 2 distinct entities, 2) Monism/Materialism (analytical philosophy, John Searle) - all mental states are physical brain states & brains cause minds. Materialism faces 2 issues: 1) the problem of free will - how can a physical mind governed by the laws of physics retain freedom of choice? 2) Consciousness - understanding & self-awareness - to understand why certain brain states have associated feelings, one needs to go beyond the physical brain (it continuously replaces its atoms via metabolic processes without affecting mental states). Brain state types theoretically allow categorization of brain states.
- analytical philosophy: Identity theory - a take on materialism - asserts that mental states are identical to physical brain states. Variants: 1) functional specification theory - relates mental & physical states based on their functional role. 2) functional state identification theory - identifies mental states with abstract computational brain states independent of the specific composition of the brain - a form of dualism.
- Leap of philosophical ideas to a formal science required mathematical formalization in 3 areas: logic, computation, probability
- Logic - Formal rules for drawing valid conclusions & algorithms to compute them. Boolean (propositional) logic was extended to First order Logic (inclusion of objects & relations). theory of reference - rules for relating logical objects to real-world objects.
- completeness theorem (Godel) - there exists an effective procedure to prove any true statement in 1st order logic; however, 1st order logic cannot capture the principle of mathematical induction required to characterize the natural numbers. Countered by the incompleteness theorem (below).
- The Church-Turing thesis - characterizes exactly which functions are capable of being computed: a Turing machine is capable of computing any computable function. Turing also showed that there are some functions which cannot be computed by a Turing machine - eg the Halting Problem
- Probability theory - for dealing with uncertain measures & incomplete theories in a quantitative manner. Bayes rule for updating probabilities in light of new evidence - Bayesian analysis is the basis for AI reasoning under uncertainty
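Bayes' rule, P(H|e) = P(e|H)P(H) / P(e), can be sketched for a discrete hypothesis space (the numbers below are made up for illustration):

```python
def bayes_update(prior, likelihood):
    """Posterior over hypotheses given one piece of evidence.
    prior: {hypothesis: P(H)}; likelihood: {hypothesis: P(e|H)}."""
    unnorm = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnorm.values())                 # P(e), by total probability
    return {h: p / z for h, p in unnorm.items()}

# Eg: is the road wet? Prior belief vs. the evidence "sensor reports wet"
prior = {"wet": 0.3, "dry": 0.7}
likelihood = {"wet": 0.9, "dry": 0.2}        # P(sensor says wet | H)
posterior = bayes_update(prior, likelihood)
print(round(posterior["wet"], 3))  # 0.659
```

Each new piece of evidence can be folded in by calling `bayes_update` again with the current posterior as the prior; this repeated updating is exactly the basis for reasoning under uncertainty.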
- Numerical analysis – function approximation
- An economy consists of individual rational agents with the goal of maximizing their utility (preferred outcomes: maximize both short & long term payoff) --> decision analysis techniques for competitive & interactive environments
- Decision Theory focuses on expected utility maximization, whilst Game Theory focuses on optimal outcomes for rational agents
- Decision theory - combines probability & utility theory - formal & complete framework for decisions made under uncertainty - probabilistic descriptions generalize the environment, so that an individual agent does not need to pay attention to the actions of other individual agents
- Game theory – rational agents can affect the utility of each other (either positively or negatively) via competition / cooperation. A rational agent must reduce its predictability to adversaries - maximize utility by apparently acting randomly. Rational action selection.
- Operations research - scarce resource allocation, scheduling, optimization. Mechanisms for making rational decisions when action payoffs are not immediate but instead result from several actions taken in sequence.
- Markov Decision Processes – dynamic programming
- Satisficing - making decisions that are good enough rather than laboriously calculating an optimal decision
- Quantitative Finance – multi-agent value trading systems that interact in stochastic environments
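The MDP + dynamic-programming pairing mentioned above can be sketched as value iteration on a tiny chain world (a toy example of my own, not from the book):

```python
def value_iteration(states, actions, T, R, gamma=0.9, eps=1e-6):
    """Dynamic programming for an MDP: iterate the Bellman update
    V(s) = max_a sum_s' T(s,a,s') * (R(s,a,s') + gamma * V(s')) to convergence."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(
                sum(p * (R(s, a, s2) + gamma * V[s2]) for s2, p in T(s, a).items())
                for a in actions
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

# Toy 3-state chain: moving "right" reaches the goal (state 2) with prob 0.8
states, actions = [0, 1, 2], ["left", "right"]
def T(s, a):
    if s == 2:                      # goal is absorbing
        return {2: 1.0}
    if a == "right":
        return {min(s + 1, 2): 0.8, s: 0.2}
    return {max(s - 1, 0): 1.0}
def R(s, a, s2):
    return 1.0 if (s != 2 and s2 == 2) else 0.0

V = value_iteration(states, actions, T, R)
print(V[1] > V[0])  # True: states closer to the goal are worth more
```

The sequential-payoff point is visible in the numbers: state 0 earns no immediate reward under any action, yet has positive value purely because it leads to states that do.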
- Cognitive Complexity
- Complexity theory – gestalt emergence, connectionism - structuring knowledge in a way that software can reason with it
- Linguistics - relating language to thought - understanding sentence syntax, semantics (subject matter & context). NLP - computational linguistics, knowledge representation & ontology
- Cognitive psychology - viewed the brain as an information processing device whereby perception involves unconscious logical inference via mental constructs - concepts. a knowledge based agent: - stimulus is translated into an internal representation which is manipulated by cognitive processes to derive new internal representations which are retranslated back into actions. A cognitive organism carries a small scale model of external reality and its own possible actions --> anticipation: possible to mentally simulate action walkthroughs, conclude the optimal action path, & react to future situations before they arise
- Control theory & Cybernetics
- Autonomous software - Self-regulating systems can modify their behavior in response to environmental changes - stable feedback systems
- steady-states – self-correcting dynamic systems aim to achieve homeostasis
- Control theory - behavior arises from a regulatory mechanism trying to minimize error - the difference between current state & goal state
- Stochastic optimal control (modern control theory) – optimize behavior by maximizing objective functions over time. Utilizes calculus & linear algebra --> mathematical limitations: suitable for systems described by fixed sets of continuous variables where exact analysis is feasible only for linear systems
- AI was in part founded to escape from these mathematical limitations via the tools of logical inference & computation --> allowed consideration of problems such as language , vision & planning
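The error-minimizing regulator described above can be sketched as a simple proportional controller driving a state toward a goal setpoint (illustrative only; the gain & dynamics are made up):

```python
def regulate(state, goal, gain=0.5, steps=20):
    """Minimal homeostatic feedback loop: each step acts to reduce the
    error between current state & goal state."""
    history = [state]
    for _ in range(steps):
        error = goal - state
        state += gain * error        # control action proportional to error
        history.append(state)
    return history

trace = regulate(state=0.0, goal=10.0)
print(round(trace[-1], 2))  # 10.0 - converged on the setpoint
```

This also shows the mathematical limitation the notes mention: the analysis is clean because the system is linear in one continuous variable; logic-based AI was partly motivated by problems (language, vision, planning) that cannot be cast this way.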
4 approaches to AI
2 axes: thought / behavior (actions), rationality / humanity.
1) Behavior Simulation
o the Turing test (1950) approach - simulate intelligent (i.e. human) behavior
o duplicate an exemplar - evaluate distinguishability from standard intelligent beings (humans). required capabilities:
o A) Communication (NLP) - generate comprehensible sentences for socialization in environment.
o B) Knowledge representation - store input percepts in a representation suitable for inference and decision making
o C) automated reasoning - use information store to answer questions & draw new conclusions
o D) machine learning - detect & extrapolate patterns --> adapt to new circumstances - construct a worldview model of effective strategies for dealing with the environment.
o E) computer vision - perceive objects in environment. Visual perception - provides feedback information on action result
o F) actuators - manipulate objects in environment
2) cognitive modeling
o determine how humans think - study the underlying principles of intelligence: 2 methods: 1) introspection ( real-time self-evaluation of thought state); 2) psychological experiments
o Construct a sufficiently precise theory of mind, trace reasoning steps & express it as software
o Cognitive science - construct computer models based on falsifiable theories of mind yielded from psychological experiments.
3) logical laws of rational thought
o use mechanical inferences to reason logically to the conclusion that a given action will achieve one's goals & then act on that conclusion.
o precise notation for statements about all kinds of objects & their relationships (as opposed to arithmetic notation, which covers only statements about numbers, eg via the relations =, <, >)
o Emphasis on correct inferences
o 2 obstacles: 1) difficult to take informal or uncertain knowledge and state it in formal terms of logical notation; 2) combinatorial explosion – a small input of facts can exhaust computational resources - requires reasoning prioritization.
o Perfect rationality (always doing the right thing) is not feasible as the computational demands are too high. Instead aim for satisficing limited rationality - acting appropriately when there is not enough time to do all computations
4) Rational agents
o autonomous, perceptive, adaptable, goal modification: act to achieve best possible outcome under uncertainty and limited resources
o Utilizes laws of thought approach when applicable. Use alternative methods in situations where a) there is no provably correct thing to do, yet something must be done; b) inferences are not involved (e.g. fast reflex actions).
o advantages over other approaches: 1) more general than laws of thought approach - uses it as one of its toolsets; 2) rationality is completely general & clearly defined - more amenable to scientific development than approaches based upon human behavior / thought (which is the product of complex evolution for a specific environment)
5 possible goals to AI
- 1) perfect rationality: Not realistic - calculations are too time consuming. Such an agent would act at every instant to maximize its expected utility
- 2) calculative rationality: Logical & decision-theoretic agents - approximate perfect rationality. Such an agent eventually returns what would have been the rational choice at the beginning of its deliberation. In practice, perf constraints force design compromise.
- 3) bounded rationality: involves satisficing thresholds - deliberating only long enough to get a good enough answer.
- 4) bounded optimality: Specifies optimal programs rather than optimal actions generated by the programs. The agent behaves as well as possible given its computational resources. Best approach - advantages: feasible, applicable. Complex architecture - must optimize its internal organization to its current environment
- 5) Asymptotic bounded optimality: Let a program P be BO for a machine M in environment class E whose environments are of unbounded complexity. Then program P' is ABO for M in E if it can do at least as well as P by running on a machine kM (that is k times faster than M). For a nontrivial environment on a nontrivial architecture, an ABO program would satisfice unless k was enormous.
- Computational theory of mind – is thought more than computation?
- Singularitarianism – utopian ideology
- Decidability (decision problem) – existence of an algorithm for determining membership in a set of formulas. deciding the truth of any logical proposition involving the natural numbers - whether there are fundamental limits to the power of effective proof procedures & the formalization of general mathematical reasoning as logical deduction.
- incompleteness theorem - real limits to computation / deduction exist - in any language expressive enough to describe the properties of the natural numbers, there are true statements whose truth cannot be established by any algorithm --> there are some functions that cannot be represented by an algorithm & computed, and no system of axioms can contain its own proof of consistency
- Halting Problem - no Turing machine can determine whether a given program will return an answer on a given input or run forever --> certain mathematical questions are in principle unanswerable by formal methods.
- Intractability - when the time required to solve instances of a problem grows exponentially with the size of the instances. Exponential growth means that even moderately large instances cannot be solved in any reasonable time --> should strive to divide the overall problem of generating intelligent behavior into tractable sub-problems instead of intractable ones. computational complexity theory: NP-completeness - large classes of canonical combinatorial search & reasoning problems are intractable – input bound: scaling up is not the solution.
- Context sensitivity - can't just rely upon simple syntactic manipulation. Eg language translation - poor Soviet-era Russian-English machine translation led to a cut in US government funding
- P <> NP (conjectured)
- Boolean satisfiability problem: given a sentence of propositional logic, is there an assignment of truth values to the proposition symbols of the sentence that makes it true? Unless P=NP there can be no algorithm that solves all satisfiability problems in polynomial time.
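A brute-force satisfiability check makes the exponential blow-up concrete: it enumerates all 2^n truth assignments (a sketch of my own; real solvers, eg DPLL, prune far more aggressively but remain worst-case exponential):

```python
from itertools import product

def satisfiable(clauses, symbols):
    """CNF satisfiability by exhaustive enumeration - O(2^n) assignments.
    clauses: list of clauses; each clause is a list of (symbol, polarity)."""
    for values in product([False, True], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if all(any(model[sym] == pol for sym, pol in clause) for clause in clauses):
            return model
    return None

# (A or not B) and (B or C) and (not A or not C)
clauses = [[("A", True), ("B", False)],
           [("B", True), ("C", True)],
           [("A", False), ("C", False)]]
print(satisfiable(clauses, ["A", "B", "C"]) is not None)  # True
```

Doubling the number of symbols squares the number of assignments examined, which is exactly why "scaling up the hardware" is not the solution for NP-complete problems.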
- Turing's qualification problem: the inability to capture the complexity of human behavior as a set of logical rules leads to a computer's inability to generate human-like behavior. No program has come close to Turing's predicted threshold of fooling interrogators 30% of the time - the field as a whole pays little attention to the Turing test or to the Loebner prize.
- Physical symbol system hypothesis - a symbolic system provides the necessary & sufficient means for general intelligent action. Any system exhibiting intelligence must operate by manipulating data structures composed of symbols. GOFAI claims that all intelligent behavior could be captured by a logical agent that reasons via inference from a set of facts & rules describing the domain. This hypothesis has been challenged
- The "brain in a vat" thought experiment - intentional [brain] states - propositional attitudes (states such as believing, knowing, desiring, fearing etc) necessarily refer to some aspect of the real world. This experiment aims to separate intentional states from their external objects. Let a neural interface be the only sensory IO with the real world. The experiment asks if the vat brain experiences (simulates) the same mental states as those of an embodied brain. The wide content view (external omniscient observer, where mental states include brain state + environmental history) views them as different. The narrow content view (internal subjective observer, where mental states include only brain state) views them as identical.
- The brain prosthesis thought experiment - gradually replace all the neurons, & then reverse the process. Would introspection of consciousness be altered – depends on whether consciousness is epiphenomenal (something that happens, but casts no shadow on the real world).
- Searle's Chinese Room thought experiment - challenges functionalism. A hypothetical system passes the Turing test, but doesn't understand any of its IO (ie the Chinese language). States that running the appropriate program is not a sufficient condition for being an understanding mind. Searle saw software as formal, syntactic entities, & minds as having semantics (ie mental contents) --> syntax is not sufficient for semantics; the causal features of the brain are required. Counter: one could make the same argument about the brain - a single neuron does not understand, but consciousness emerges from the whole system, a larger functional unit. The argument is also not logically airtight: denying functionalism does not by itself establish that only brains can be minds.
- Consciousness – subjective understanding and self-awareness. There is still an explanatory gap to the mystery of consciousness. Qualia (internal experiences) challenge Functionalism – different representations for isomorphic causal processes. Computational simulation of mental processes – humans do have real minds, while machines might or might not - mind-body problem: minds may be substrate specific to brain
- Current hardware limitations of supercomputers compared to brains - operations /sec: 10^15 vs 10^17. Neural substrate is slower, but massively parallel – approx. 20 Petaflops. Even with theoretical unlimited capacity, we do not have intelligent software , but there is the concept of brute force self-optimization and code re-generation. Moore's law (18 months computing power doubling) mechanism switched in 2005 from focus on clock speed to focus on core count due to power dissipation (electron jumping) issues.
- Computation does well at combinatorial tasks, but not on judgment tasks. machine learning has made very little progress on constructing new representations at levels of abstraction higher than the input vocabulary - concept building eg raise arm & swallow = have a cup of tea.
- Weak AI argument from disability - mental processes required to produce a given behavior can be simulated.
- Weak AI mathematical objection - Godel's incompleteness theorem has 3 counter-arguments: 1) it applies only to formal systems that are powerful enough to do arithmetic (eg Turing machines, which are infinite) - it does not apply to computers, which are finite and can therefore be described as a (very large) system in propositional logic, which is not subject to Godel's incompleteness theorem. 2) there are sentences which can be asserted (truth can be established) by some agents and not by other agents - eg a self-referential sentence about a specific agent's inability to assert the subject sentence - cannot be asserted by the specific agent as it would be a contradiction (note: the sentence must be true, because if it were false, then the specified agent could not consistently assert it). 3) Even if computers have limitations on what they can prove, so do humans! While GIT rigorously demonstrates that a formal system cannot do some un-formalized activity X, it is impossible to prove that informal systems (such as humans & computers) are not subject to GIT, because any rigorous proof would itself contain a formalization of X, & hence refute itself. Just as human reasoning can be inconsistent, so too can mathematical reasoning.
- Weak AI argument from informality of behavior - Hubert Dreyfus criticism of GOFAI does not apply to probabilistic reasoning systems which are more practical for open ended domains. Dreyfus - human expertise includes knowledge of some rules only as a background holistic context within which humans operate - eg thought processes of appropriate social behavior are conducted at a level not open to introspection by the conscious mind. Dreyfus proposed a 5 stage process of acquiring expertise via organizing a learning model for instantaneously selecting correct responses. Good generalization from examples can be achieved by incorporating background prior knowledge into learning algorithms, but this relies upon the availability of knowledge in specific form - supervised (requiring prior identification of relevant inputs & correct outputs). Attention - The mind is able to direct its sensors to seek relevant information and to process it to extract aspects relevant to the current situation - the theories of information value & active vision are likewise used to direct sensors.
Current Frontier (Still working on this)
- background commonsense
- compiled forms of decision making
- embodied cognition - agents situated in an environment – augment reasoning via perception
- High Order Logic
- probabilistic First Order Logic
- Hierarchical Dynamic Bayesian Networks
- Explanation-based learning and new reflex generation
- Real-time AI – in environments when there is never time to solve decision problems exactly. Agents need ways to control their own deliberations - to cease deliberating when action is demanded & use the timeslot to execute the most profitable computations - prioritization - eg evasive action
- Metareasoning - used to design better search algorithms that are "anytime". 2 new techniques for general decision making: 1) Anytime algorithms - Dean & Boddy, Horvitz, Zilberstein & Russell - output quality improves gradually over time, so that the algorithm has a reasonable decision ready whenever it is interrupted - controlled by a meta-level decision procedure that assesses whether further computation is worthwhile. Eg iterative deepening search in game playing. 2) decision-theoretic metareasoning - Horvitz, Russell & Wefald, Breese - applies the theory of information value (ch16) to the selection of computations based on their cost (in terms of delaying action) & their benefits (in terms of improved decision quality). Meta-reasoning is expensive - use compilation to keep the overhead small compared to the costs of the computations being controlled
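An anytime algorithm can be sketched as a generator that yields ever-better answers, so that whenever deliberation is cut off, the best answer so far is used (a toy example of my own; "quality" here is just the precision of a square-root estimate, standing in for decision quality):

```python
def anytime_sqrt(x, initial=1.0):
    """Anytime Newton iteration: every yielded value is a usable answer,
    & quality improves the longer the caller lets it run."""
    guess = initial
    while True:
        yield guess
        guess = 0.5 * (guess + x / guess)

def deliberate(anytime, budget):
    """Meta-level control: run until the computation budget expires,
    then act on the best-so-far answer."""
    answer = None
    for _, answer in zip(range(budget), anytime):
        pass
    return answer

rough = deliberate(anytime_sqrt(2.0), budget=2)
precise = deliberate(anytime_sqrt(2.0), budget=8)
print(abs(precise - 2.0 ** 0.5) < abs(rough - 2.0 ** 0.5))  # True: more time, better answer
```

A decision-theoretic metareasoner would choose `budget` itself, stopping when the expected improvement in answer quality no longer justifies the cost of delaying action.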
- Reflective architecture - enables introspective deliberation about the computational entities & actions of the actual deliberation. Requires construction of a joint state space composed from environment state & computation agent state. Eventually task specific algorithms (eg alpha-beta search, backward chaining) will be replaced by generic methods that direct the agent's computation towards the efficient generation of high quality decisions
- Apprenticeship learning – specify utility function by example