Chapter 1 Artificial Intelligence
Artificial Intelligence (AI) technologies are increasingly becoming part of our daily life. They are now commonly used in academia and the business world alike, across a wide range of fields, allowing the application of notions from both the natural and the social sciences. While AI is based on mathematics and computer science, its recent successes derive from a combination of three fundamental factors: the increasing availability of large amounts of data, growing computing power, and new and more efficient algorithms.
1.1 What is AI?
Although there are many different definitions of what exactly the aim of AI studies is1, the development of AI systems essentially focuses on building machines capable of performing tasks that typically require human intelligence. On a more practical level, when we speak of modern AI systems we generally refer to prediction machines. AI systems have no cognitive state (Searle 1990): they are simply machines that, given some inputs, return some more or less determined outputs. We speak of AI because these machines are capable of returning accurate answers even when provided with limited and partial information, that is, of predicting. AI systems exploit previously collected data to create a statistical model of the world capable of predicting an output corresponding to our expectations with a reasonable degree of accuracy. In this framework, intelligence (in its general meaning) is not involved at all.
In this context, some experts distinguish between Artificial Narrow Intelligence (ANI) and Artificial General Intelligence (AGI). ANI, or weak AI, involves AI systems specialized in performing a particular activity, such as playing chess, recognizing images, or transcribing speech. As we are all aware, the diffusion of such systems has rapidly made these instruments part of the essential toolkit of our daily life, automating repetitive tasks and increasing efficiency.
On the other hand, with AGI, or strong AI, scholars identify a machine capable of autonomously performing any intellectual task that a human can perform; in other words, an artificially intelligent system capable of achieving a broad and general objective, using a model of reality to make predictions and plan actions (Flowers 2019). Many scholars, above all Searle (1990) and Dreyfus (1972), think that this objective is neither achievable nor desirable.2
1.2 An economic perspective on AI
Since modern AI technologies represent a drastic improvement in prediction technology (Agrawal, Gans, and Goldfarb 2018b), let us consider the role of prediction in the decision-making process. According to Agrawal, Gans, and Goldfarb (2018b), the decision-making process can be split into the stages of prediction and judgment, where prediction is the act of mapping the possible outcomes of a variety of actions and judgment is the act of choosing the path best suited to reach an objective (Agrawal, Gans, and Goldfarb 2018a).
In economic terms, given a predetermined state \(S\), prediction consists of mapping the set of actions \(X\) that can be undertaken, together with their respective outcomes. Judgment, on the other hand, consists of evaluating the outcome obtained from choosing an action \(\chi\) from the set of possible actions \(X\). The payoff for each action can be defined as a utility function \(\mu(\chi, \theta)\), where \(\theta\) is the realization of an uncertain state drawn from a distribution \(F(\theta)\) (Agrawal, Gans, and Goldfarb 2018a). The decision-making process is thus reduced to a “standard problem of choice under uncertainty” (Agrawal, Gans, and Goldfarb 2018a) where prediction maps the likelihood of possible outcomes and judgment ranks the desirability of each outcome. Since it is assumed that utility can only be determined by humans, who have to undertake a costly process that allows the mapping from \((\chi, \theta)\) to a specific payoff value \(\mu\), the utility function is unknown to the AI. Evaluation requires time and effort, and it is strictly linked to the operator’s final goal, which is not easily translatable to a machine.3
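In this notation, the decision problem can be summarized as a standard expected-utility maximization; the compact form below is my own rendering of the setup just described, not a formula taken from Agrawal, Gans, and Goldfarb (2018a):
\[
\chi^{*} \;=\; \arg\max_{\chi \in X} \; \mathbb{E}_{\theta \sim F}\!\left[\mu(\chi, \theta)\right] \;=\; \arg\max_{\chi \in X} \int \mu(\chi, \theta)\, dF(\theta),
\]
where prediction supplies (an estimate of) the distribution \(F(\theta)\) over uncertain states and judgment supplies the payoff function \(\mu\).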
AI technologies have thus substantially reduced the costs associated with prediction. However, while in some cases this has led to the full automation of simple tasks, in others it is unlikely to lead to the substitution of human users, but rather to the enhancement of human capabilities. In economic terms, this amounts to asking whether judgment and prediction are complements or substitutes. Agrawal, Gans, and Goldfarb (2018a) created a model to determine whether prediction can reach a level of accuracy able to entirely take over the judgment part, concluding that the two are complements, provided that judgment is not too difficult. When complexity is introduced, “the impact of improved prediction on the value of judgment depends on whether improved prediction leads to automated decision making” (Agrawal, Gans, and Goldfarb 2018a). As complexity increases, humans tend to opt for default decisions and heuristics that, on average, perform well. This reduces the economic value of judgment and increases that of prediction at the margin. Agrawal, Gans, and Goldfarb (2018a) analyzed complex problems such as automation, contracting, and firm boundaries. They suggest that the possibility of introducing automated decision-making through prediction largely depends on the specific features of the tasks, with the value of judgment increasing when the cost of mistakes is particularly high, as in applications of AI to tasks that have a great impact on people’s lives, such as the automatic detection of tax fraud.
1.3 A Brief History of AI
As a formal science, AI was born less than 80 years ago, drawing from a variety of different fields, such as philosophy, mathematics, linguistics, psychology, and economics, among others (Russell and Norvig 2010). Providing a complete review of the topic goes beyond the scope of this dissertation; therefore, in the following pages I will present only a brief summary of the milestones of modern AI.
1.3.1 Early developments: 1940s – 1950s
In 1943, McCulloch and Pitts published the first work that is now generally considered AI (McCulloch and Pitts 1943). By combining the basic physiology of neurons, the formal analysis of propositional logic, and Turing’s theory of computation, they demonstrated that networks of connected neurons could compute any computable function. This work is notably the forerunner of both the logicist and the connectionist traditions of AI. Other scholars subsequently started working on these concepts, and in 1951 Minsky and Edmonds built the first neural network computer (Russell and Norvig 2010). In 1950, Alan Turing published the seminal paper “Computing Machinery and Intelligence”, in which he described the process of creating intelligent machines and proposed a method of testing machine intelligence through an “imitation game”, where a human judge tries to distinguish a human from a machine in a teletype dialogue4. In 1956, John McCarthy organized a workshop at Dartmouth College that brought together all the significant figures of the growing AI field. While the workshop itself did not lead to any notable discovery, it introduced these researchers to one another, fundamentally contributing to the evolution of the AI field (Russell and Norvig 2010).
1.3.2 High expectations and the first AI winter: 1960s – 1970s
In the following years, the AI field experienced its first successes, such as the creation of the General Problem Solver (GPS)5. These early programs often contained little or no knowledge of their subject matter and often succeeded by using simple syntactic manipulations (as in the first machine translation efforts). It was a general belief that solving more complex problems was only a matter of faster hardware and larger memories, an assumption later proven wrong (Russell and Norvig 2010). This progressively led to what has been described as the first AI winter. The high expectations that the public had for AI gradually faded away and, in 1974, criticism of AI technologies by Sir James Lighthill added to the pressure from public research institutions to fund more productive projects, leading the US and British governments to cut funding for exploratory AI research (Gonsalves 2019).
1.3.3 Expert systems: 1980s – 1990s
The development of the first microprocessors at the end of the 1970s renewed interest in AI research and brought expert systems to the fore. An early example was the DENDRAL program, an expert system capable of inferring molecular structure from the information provided by a mass spectrometer (Nilsson 2009). Its expertise derived from large numbers of special-purpose rules collected through the extensive interviewing of domain experts. In the 1980s, expert systems gained commercial success and, by 1985, the AI market had reached a value of over a billion dollars (Russell and Norvig 2010). Moreover, when Japan launched the Fifth Generation Computer Project, US and UK researchers feared that Japan could establish supremacy in the field and managed to generate a similar investment in the US (Ray 1998). At the end of the 1980s, the first limitations of expert systems emerged. Their programming was extremely complex and, as the number of rules increased, a “black box” effect made the functioning of the machine hard to understand. Development and maintenance were challenging, and therefore faster, easier, and less expensive ways to achieve the same results progressively took over (Russell and Norvig 2010). The last major success of expert systems was Deep Blue, the IBM system that in 1997 beat the world chess champion Garry Kasparov. Nevertheless, like all expert systems, it was based on a brute-force algorithm that evaluated each move according to a set of rules extracted from chess books and previous grandmaster games (Campbell, Hoane, and Hsu 2002), and its victory marked the beginning of the abandonment of rule-based AI systems. Even a relatively simple task such as playing chess required setting a complex set of parameters, limiting the applicability of such systems to the vast complexity of the world (Nilsson 2009).
1.3.4 Data + computing power = machine learning: 2000s – 2010s
In the late 1990s and early 2000s, increased computational power and data availability gave new impetus to AI research. A focus on the resolution of closed and specific problems and new ties between AI and other fields led to increased use of AI for logistics, data mining, and medical diagnosis: AI entered into the mainstream technological use (Haenlein and Kaplan 2019).
The development of the internet and the dramatic increase in data availability drastically improved the performance of machine learning algorithms. Unlike expert systems, where the programmer had to explicitly tell the machine how to act in individual situations, machine learning systems build a mathematical model based on input data. While the first learning systems were developed in the early 1960s, their performance improved only after large sets of training data became available (Nilsson 2009). Machine learning is not about writing a set of instructions to perform a specific task but rather about letting the machine determine the best prediction model for a particular problem, based on massive amounts of data and a limited number of human-set parameters. Quality training sets became more important, marking a paradigm shift from the previous approach focused on algorithm optimization (Russell and Norvig 2010). The larger and more complex the training set is, the more computing power is required. Graphics processing units (GPUs), capable of performing massively parallel computations, made it affordable to build ML models and therefore increased AI researchers’ data processing capabilities (Lee, Tsung, and Wu 2018). In the past years, machine learning has been used to perform more complex tasks within the framework of narrow and clearly defined problems, such as board games, text classification, and image recognition. In 2011, Watson, the question-answering AI system powered by IBM, repeatedly won against two “Jeopardy!” champions (Ferrucci 2012); in 2012, Google X developed an AI system able to recognize various objects (such as cats and dogs) in videos without any explicit programming; and, in 2016, AlphaGo beat the world champion of the Chinese game Go (Borowiec 2016). More recently, deep learning (a subfield of machine learning based on multiple layers of artificial neural networks) has increased in popularity among AI engineers (Goodfellow et al. 2016). Deep learning is now commonly used for machine translation, image recognition, drug design, and many other applications, but it still presents some limitations, such as a “black box” effect that prevents human engineers from understanding how and why a final model does or does not work.
1.4 Two different approaches to AI: symbolism and connectionism
Since the beginning of the formal study of AI, it has been possible to discern two different schools of thought that influenced how the field developed: symbolism and connectionism.
The symbolist tradition of AI is based on the idea that intelligence can be achieved through the manipulation of symbols6 (Newell and Simon 2007), where symbols are high-level abstractions of objects, problems, and logic. They provide the AI with an already codified representation of knowledge, following a top-down approach. A typical example of such systems are expert systems, which emulate a human expert’s decision-making ability by codifying it in a series of if-then rules drawn from a previously acquired knowledge base7. These systems present the advantage that, once the basic shell is developed, it is possible to instill knowledge through a format that is easily comprehensible to domain experts, reducing the cost of intervention of IT experts. Additionally, it is possible to obtain a prototype in a relatively short amount of time. However, the process of knowledge acquisition required to achieve a valuable product may turn out to be extremely long (Greene 1987), and the larger the knowledge base becomes, the harder it is to verify the consistency of the decisions, making prioritization and ambiguities common challenges of sizable expert systems. Finally, whenever it becomes necessary to update the knowledge base, the risk of disrupting the system persists, since every change in the if-clause statements might potentially endanger the whole system (Partridge 1987). Some scholars even affirmed that expert systems that are not able to learn should not be considered AI at all (Schank 1983). While many agree with this proposition, expert systems represented a milestone in the development of the field and are generally considered AI for historical reasons (Partridge 1987).
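To make the knowledge base/inference engine structure described in footnote 7 concrete, the sketch below implements a minimal forward-chaining engine in Python; the facts and if-then rules are invented purely for illustration and do not come from any real expert system.

```python
# Minimal forward-chaining inference engine: a knowledge base of facts and
# if-then rules, plus an engine that fires rules until no new fact is derived.
# The toy "car diagnosis" rules are purely illustrative.

facts = {"engine_cranks", "battery_charged"}

# Each rule: (set of premises, conclusion)
rules = [
    ({"engine_cranks", "no_fuel"}, "refill_tank"),
    ({"engine_cranks", "battery_charged"}, "electrical_system_ok"),
    ({"electrical_system_ok"}, "check_spark_plugs"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose premises are all known facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain(facts, rules))
# derives 'electrical_system_ok' and, from it, 'check_spark_plugs'
```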
The connectionist (or sub-symbolist) tradition adopts another approach, affirming that to achieve intelligence AI systems need to mimic the functioning of the human brain rather than just manipulate abstractions. In the words of Searle (1990): “a computer has syntax but not semantics.” Sub-symbolic systems follow a bottom-up approach, starting from the lowest level and building intelligence by connecting the dots. Machine learning (ML) is an example of such technologies: it allows AI systems to learn from examples without hard-coded instructions, by adjusting a set of parameters in order to achieve a desirable outcome. ML optimizes a statistical model defined up to some parameters based on example data or past experiences, also called training data (Alpaydin 2004). The model can be predictive, descriptive, or both. Assuming that the data is accurate and the model was constructed correctly, ML systems are able to detect patterns and schemes with acceptable approximation, sometimes even outperforming human experts (Grace et al. 2018). Performance greatly relies on the training set: the more examples a model is given during training, the more information it has when it comes to making accurate predictions and correctly classifying unseen cases (Russell and Norvig 2010; Jordan and Mitchell 2015).
The main techniques used to build ML models can be classified into supervised, unsupervised, and reinforcement learning (Alpaydin 2004). In supervised learning, the training data is labeled, meaning that a sample of input vectors (inputs) is coupled with a sample of corresponding target vectors (outputs).8 Unsupervised learning instead focuses on identifying patterns in a training set for which no labeling or feedback is provided (Dey 2016). It is mainly used in cluster analysis, detecting “potentially useful clusters of input examples” (Russell and Norvig 2010). The underlying assumption is that the structural properties (algebraic, combinatorial, or probabilistic) of the data might provide insights essential to the prediction (Jordan and Mitchell 2015). Finally, reinforcement learning maximizes a payoff function based on the actions or decisions that the AI system takes: no labeled training examples are provided, only an indication of whether a decision taken in an unknown dynamic environment is correct or not for a given input (Jordan and Mitchell 2015).9 While this classification is helpful to understand the dynamics underlying machine learning, in most real-life applications AI models are trained using mixed strategies (generally referred to as semi-supervised learning).
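As a minimal sketch of the difference between the first two paradigms (assuming scikit-learn and its bundled iris dataset), the same inputs can be fed to a supervised classifier, which sees the labels, and to a clustering algorithm, which does not:

```python
# Supervised vs. unsupervised learning on the same tabular data (scikit-learn).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Supervised learning: input vectors are coupled with target labels.
clf = LogisticRegression(max_iter=200).fit(X_train, y_train)
print("supervised accuracy on unseen cases:", clf.score(X_test, y_test))

# Unsupervised learning: the same inputs, but no labels are provided;
# the algorithm only looks for structure (clusters) in the data.
clusters = KMeans(n_clusters=3, n_init=10).fit_predict(X)
print("cluster assignments for the first ten examples:", clusters[:10])
```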
Recently, deep learning, an ML technique that uses multiple layers of artificial neural networks, has proven effective in solving many complex problems. An artificial neural network is a collection of connected artificial neurons that mimic the behavior of biological neurons. An artificial neuron is a simple processing unit designed to be trained from data (through unsupervised induction) (Henderson 2010). In an artificial neural network, each neuron receives an incoming signal and transmits it to the neurons it is connected to. Typically, neurons are aggregated into layers that may perform different transformations on their inputs (Russell and Norvig 2010). These kinds of machine learning structures have been applied to many fields, such as computer vision, speech recognition, natural language processing, audio recognition, and board game programs, where they reached results comparable to those of human experts and, in some cases, outperformed them (Pouyanfar et al. 2018). Unfortunately, training this kind of structure is very complex, and deep learning algorithms often operate as “black boxes”, since their complexity makes it difficult for ML engineers to understand their functioning (Castelvecchi 2016).
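The following toy example (plain NumPy, with random, untrained weights) only illustrates how layered neurons transform an incoming signal; a real deep learning model would contain many more layers and would have its weights adjusted during training:

```python
# A toy two-layer neural network forward pass, to make the idea of
# "neurons aggregated into layers that transform their inputs" concrete.
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)   # a common non-linear activation

x = rng.normal(size=(4,))      # input signal (4 features)
W1 = rng.normal(size=(8, 4))   # first layer: 8 neurons, each connected to all inputs
W2 = rng.normal(size=(3, 8))   # second layer: 3 output neurons

hidden = relu(W1 @ x)          # each neuron combines its inputs and applies a non-linearity
output = W2 @ hidden           # the next layer receives the transmitted signal
print(output)
```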
1.5 Some AI capabilities
Media and popular culture often depict AI systems in the form of gigantic supercomputers, which seem far from our human experience. Reality, however, contradicts these colorful narratives. Different applications (such as search engines, microtargeted advertising, news aggregators, recommendation systems, speech recognition, and so forth), products, and services that use AI technology are entering our daily lives, providing alternative solutions to traditional tasks. With no pretense of completeness, I consider it useful to provide a non-exhaustive list of AI capabilities, organized along the lines of human abilities: seeing, listening, understanding, thinking, and moving.
1.5.1 Seeing: Computer Vision
Computer vision is a field of AI that aims to understand the content of visual inputs. It may involve extracting a description, either in the form of a text, an object, or a three-dimensional model. In other words, “at an abstract level, the goal of computer vision problems is to use the observed image data to infer something about the world” (Prince 2012). Computer vision technologies acquire, process, analyze, and understand visual inputs to extract high-dimensional data from the real world to produce numerical and symbolic information (Rosenfeld 1988).
The main tasks performed by computer vision systems are recognition, motion analysis, scene reconstruction, and image restoration. Image recognition determines whether or not image data contains a specific object, feature, or activity (Forsyth and Ponce 2012). Its applications span from optical character recognition (OCR) to pose estimation and facial recognition (Nilsson 2009). Motion analysis instead processes an image sequence (a video) to estimate the direction and velocity of an object (tracking) or of the camera itself (ego-motion). When tracking and ego-motion technologies are combined, they are generally referred to as optical flow (Russell and Norvig 2010). Scene reconstruction deals with the construction of a model of a three-dimensional space. Its applications range from crime reconstructions to the digitization of cultural heritage and geospatial mapping (Trucco and Verri 1998). Finally, image restoration is used to restore old or damaged visual content, removing noise and augmenting quality (Banham and Katsaggelos 1997). In this way, it is possible to obtain more precise image data that can later be subjected to human analysis or other computer vision techniques.
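As an illustration of motion analysis, the sketch below estimates dense optical flow with OpenCV's Farnebäck algorithm on two synthetic frames; the frames are generated inside the script so the example is self-contained, whereas a real application would read consecutive frames from a video:

```python
# Motion analysis via dense optical flow (Farnebäck's algorithm in OpenCV).
import cv2
import numpy as np

frame1 = np.zeros((120, 160), dtype=np.uint8)
cv2.rectangle(frame1, (40, 40), (70, 70), 255, -1)    # a bright square
frame2 = np.roll(frame1, shift=5, axis=1)             # the same square, moved 5 px to the right

flow = cv2.calcOpticalFlowFarneback(frame1, frame2, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
# flow[y, x] holds the estimated (dx, dy) displacement of each pixel;
# inside the square the horizontal component should be roughly the 5-pixel shift.
print("mean horizontal displacement inside the square:",
      flow[40:70, 40:70, 0].mean())
```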
1.5.2 Listening: Speech recognition
Similar to image recognition systems, speech recognition systems transcribe speech to text. Audio signals are cleaned and processed to isolate the frequencies that represent a human voice; then, with the help of linguistic models, a final text is produced (Ashri 2020). Speech analysis can provide additional information regarding the speaker, such as identity, emotional state, health status, accent, and gender (Nassif et al. 2019). Speech-to-text technology represents a crucial asset for transforming orally recorded data into text data that can later be analyzed using Natural Language Processing (NLP) techniques, further diminishing information retrieval transaction costs. Unfortunately, the performance of speech-to-text technologies is subject to high variance depending on the setting in which the recording takes place. Various studies show that even for systems with high-quality output in controlled environments, performance drops below the human benchmark when disturbing elements10 are introduced.11
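A minimal speech-to-text sketch, assuming the Python SpeechRecognition package and a hypothetical recording named interview.wav, could look as follows:

```python
# Speech-to-text with the SpeechRecognition package, which wraps several
# recognition engines. "interview.wav" is a hypothetical file used for illustration.
import speech_recognition as sr

recognizer = sr.Recognizer()
with sr.AudioFile("interview.wav") as source:
    recognizer.adjust_for_ambient_noise(source)  # mitigate background noise
    audio = recognizer.record(source)            # read the whole file

try:
    text = recognizer.recognize_google(audio)    # send the audio to a web-based engine
    print(text)
except sr.UnknownValueError:
    print("speech could not be transcribed")     # noisy or accented input may still fail
```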
1.5.3 Understanding: Natural Language Processing
Natural Language Processing (NLP) is a field of AI concerned with the interactions between computers and human (natural) languages, particularly with how to program computers to process and analyze large amounts of natural language data (Chowdhury 2020). In the early days of NLP, the typical approach was based on expert systems, where grammatical features and heuristics were hard-coded (Nilsson 2009). With the introduction of machine learning techniques, the performance of linguistic models drastically improved. NLP is now used in fields such as “machine translation, natural language text processing, and summarization, user interfaces, multilingual and cross-language information retrieval (CLIR), speech recognition, artificial intelligence, and expert systems, and so on” (Chowdhury 2020).
Whoever attempts to build software capable of extracting and manipulating natural language generally encounters three significant issues: the extraction of the thought process, the storage of the representation and meaning of the linguistic output, and the contextualization of information (Chowdhury 2020). As a consequence, approaches may vary: some are based on lexical and morphological analysis (Part-of-Speech tagging), others on semantic and discourse analysis (Semantic Role Labeling, chunking), and still others on knowledge-based approaches (Named Entity Recognition). On a more practical level, NLP systems generally comprise a combination of all of these, starting from the word level, extending to the sentence level, and finally framing the text in the context of the specific domain (Chowdhury 2020).
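For illustration, the sketch below runs two of these analyses, Part-of-Speech tagging and Named Entity Recognition, with spaCy (assuming its small English model has been downloaded); the example sentence is my own:

```python
# Part-of-Speech tagging and Named Entity Recognition with spaCy.
# Requires the small English model: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Deep Blue beat Garry Kasparov in New York in 1997.")

# Word level: each token with its part-of-speech tag
print([(token.text, token.pos_) for token in doc])

# Knowledge-based level: named entities detected in the sentence
print([(ent.text, ent.label_) for ent in doc.ents])
```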
Some typical NLP applications currently being investigated are concept extraction, machine translation, question answering, and natural language generation (Clark, Fox, and Lappin 2013). Among these, concept extraction is the most problematic: while in some cases NLP systems have shown promising results in very restricted domains, we do not yet know how to correctly compute the meaning of a sentence based on words and context (Chai et al. 2001). Machine translation is one of the earliest applications of NLP, but it still presents several complex problems, since human language is ambiguous and characterized by a large number of exceptions. Nevertheless, machine translation has recently reached a stage that allows people to enjoy its benefits (Chowdhury 2020). Even if it is not always perfect and the translations are not as good as humans’, the results are very encouraging, and machine translation may serve as a way for human translators to speed up the process and improve their performance. Question-answering systems and natural language generators are also showing promising results.12
Despite these advances, three challenges prevent NLP from reaching full commercial success: portability, scalability, and variability. First, NLP systems establish patterns that are valid only for a specific domain and a particular task: when the topic, context, or user changes, it is necessary to create entirely new patterns. Second, advanced NLP techniques such as concept extraction are too computationally expensive for large-scale NLP applications (Sparck Jones 1999). Third, human behavior and communication patterns are erratic and constantly evolving, while NLP systems require extensive and stable corpora to produce effective results (Chowdhury 2020).
1.5.4 Strategic Thinking: AI Planning
A fundamental functional application of AI systems is AI planning, which aims to build systems capable of designing the set of actions to perform in order to achieve a desired goal. Planning involves introducing into AI systems concepts such as time, causality, intentions, uncertainty, and the actions of multiple agents. The classical formulation of the planning problem requires three inputs: a description of the state of the world, a description of the agent’s goal, and a description of the possible actions that can be performed (also called the domain theory). The planner then outputs a sequence of actions designed to achieve that goal (Weld 1999).
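A toy illustration of this formulation is sketched below: a STRIPS-like domain theory (invented for the example: a robot that can move between two rooms and push a box), an initial state, a goal, and a breadth-first planner that returns a sequence of actions:

```python
# A toy STRIPS-style planner: state of the world, goal, domain theory of
# actions, and a breadth-first search that outputs a sequence of actions.
from collections import deque

# Domain theory: (action name, preconditions, facts added, facts removed)
actions = [
    ("move(A,B)", {"robot_in_A"},             {"robot_in_B"},              {"robot_in_A"}),
    ("move(B,A)", {"robot_in_B"},             {"robot_in_A"},              {"robot_in_B"}),
    ("push(A,B)", {"robot_in_A", "box_in_A"}, {"robot_in_B", "box_in_B"},  {"robot_in_A", "box_in_A"}),
]

initial_state = frozenset({"robot_in_B", "box_in_A"})
goal = {"box_in_B"}

def plan(initial_state, goal, actions):
    """Breadth-first search over states; returns the shortest action sequence."""
    frontier = deque([(initial_state, [])])
    visited = {initial_state}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:                       # every goal fact is satisfied
            return steps
        for name, pre, add, delete in actions:
            if pre <= state:                    # action applicable in this state
                nxt = frozenset((state - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

print(plan(initial_state, goal, actions))       # ['move(B,A)', 'push(A,B)']
```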
Recently, there has been a surge of interest in what has been called domain-independent planning (planning when no input domain is specified). Machine learning algorithms are now used to improve planners’ speed and the quality of their output, but many challenges remain. The high uncertainty of the domain makes “completing a plan a significantly more difficult task than computing one” (Leonetti, Iocchi, and Stone 2016). Relevant details and dynamics might be misinterpreted, creating imperfect models. While current systems manage to scale up to problems larger than earlier ones, planning in uncertain situations still needs further research to reach acceptable results (Jiménez et al. 2012).
1.5.5 Moving: Robotics
While the previous AI applications can be performed entirely in digital environments, robotics involves the interaction of AI systems with the physical world, where uncertainty and external actors increase complexity (Brady 1985). It is important to stress that robotics does not necessarily involve AI: we can distinguish between intelligent and non-intelligent robots. Non-intelligent robots are mechanical devices that perform operations based on instructions either hard-coded into their systems or transmitted to them by humans through a telecommunication infrastructure, while an intelligent robot is “a mechanical creature which can function autonomously” (Murphy 2000). While in the early days of AI robotics was considered an integral part of the field, the two progressively diverged over the years. Robotics focused more and more on the manufacturing industry and the automation of the assembly line, which required no intelligence to function. Some contact points remained, but they were relegated to applications where humans could not efficiently communicate with robots in any way, such as space exploration (Murphy 2000). Recently, advances in both fields have brought renewed interest in closing the gap between them. Robots are increasingly introduced in “less engineered and more open environments. To do so, they need to rely on cognitive capabilities typical of AI, such as knowledge representation, learning, adaptation, and human-robot interaction” (Rajan and Saffiotti 2017). Increased autonomy of robotic systems may prove particularly useful in applications where humans are at significant risk (such as space, military, or health threats) or in trivial, physically harsh, and unpleasant tasks (such as in the service industry or agriculture).
On the other hand, the introduction of intelligent robots raises ethical concerns on a different scale than other AI applications, especially in human-robot interaction. The potential deployment of Lethal Autonomous Weapon Systems (LAWS) has increased concerns in the international community. Still, even apparently innocuous intelligent robots such as autonomous vehicles present ethical dilemmas in contexts where there is neither a human nor time available to provide an answer (e.g., in the context of a car crash). In these cases, the determination of liability is not trivial, and there is no shared position in the academic community (Lin 2011).
1.6 Conclusions
In this chapter, I presented AI technologies from both a historical and a technical perspective. In particular, I clarified a common misconception regarding AI technology: AI systems are not intelligent machines and, even if the development of AGI proves to be possible, it is not likely to happen in the short or medium term. While AI systems may seem to perform intelligent actions or to reason intelligently, they do not possess an internal cognitive state regarding the actions they are performing (Searle 1990), making them ontologically indistinguishable from other software. The recent advances in AI can be circumscribed as improvements in prediction technologies, one of the critical elements of every decision-making process, but certainly not the only one. Framing AI as a prediction technology is key to understanding why and how AI technologies are becoming pervasive in the economy. The progressive digitization of society (which improves the process of data gathering across various dimensions of the human experience) increases the opportunities to apply AI technologies in the most diverse areas, thus expanding the range of possible innovations.
References
Agrawal, Ajay, Joshua Gans, and Avi Goldfarb. 2018a. “Prediction, Judgment and Complexity: A Theory of Decision Making and Artificial Intelligence.” National Bureau of Economic Research.
Agrawal, Ajay, Joshua Gans, and Avi Goldfarb. 2018b. Prediction Machines: The Simple Economics of Artificial Intelligence. Cambridge: Harvard Business Press.
Alpaydin, Ethem. 2004. Introduction to Machine Learning. Cambridge: MIT Press.
Ashri, Ronald. 2020. “Core AI Techniques.” In The AI-Powered Workplace, 41–57. Springer.
Banham, Mark, and Aggelos Katsaggelos. 1997. “Digital Image Restoration.” IEEE Signal Processing Magazine 14 (2): 24–41.
Borowiec, Steven. 2016. “AlphaGo Seals 4-1 Victory over Go Grandmaster Lee Sedol.” The Guardian. https://www.theguardian.com/technology/2016/mar/15/googles-alphago-seals-4-1-victory-over-grandmaster-lee-sedol.
Brady, Michael. 1985. “Artificial Intelligence and Robotics.” Artificial Intelligence 26 (1): 79–121.
Campbell, Murray, Joseph Hoane, and Feng-hsiung Hsu. 2002. “Deep Blue.” Artificial Intelligence 134 (1-2): 57–83.
Castelvecchi, Davide. 2016. “Can We Open the Black Box of AI?” Nature News 538 (7623): 20.
Catania, Fabio, Pietro Crovari, Micol Spitale, and Franca Garzotto. 2019. “Automatic Speech Recognition: Do Emotions Matter?” In 2019 IEEE International Conference on Conversational Data & Knowledge Engineering (CDKE), 9–16. IEEE.
Chai, Joyce, Jimmy Lin, Wlodek Zadrozny, Yiming Ye, Margo Stys-Budzikowska, Veronika Horvath, Nanda Kambhatla, and Catherine Wolf. 2001. “The Role of a Natural Language Conversational Interface in Online Sales: A Case Study.” International Journal of Speech Technology 4 (3-4): 285–95.
Chowdhury, Kaushik. 2020. “Natural Language Processing.” In Fundamentals of Artificial Intelligence, 603–49. Springer.
Clark, Alexander, Chris Fox, and Shalom Lappin. 2013. The Handbook of Computational Linguistics and Natural Language Processing. John Wiley & Sons.
Dey, Ayon. 2016. “Machine Learning Algorithms: A Review.” International Journal of Computer Science and Information Technologies 7 (3): 1174–9.
Dreyfus, Hubert. 1972. What Computers Can’t Do: The Limits of Artificial Intelligence. MIT Press.
Eskenazi, Maxine. 1999. “Using Automatic Speech Processing for Foreign Language Pronunciation Tutoring: Some Issues and a Prototype.” Language Learning & Technology 2 (2): 62–76.
Ferrucci, David. 2012. “Introduction to ‘This Is Watson’.” IBM Journal of Research and Development 56 (3.4): 1–1.
Flowers, Johnathan Charles. 2019. “Strong and Weak AI: Deweyan Considerations.” In AAAI Spring Symposium: Towards Conscious AI Systems.
Forsyth, David, and Jean Ponce. 2012. Computer Vision: A Modern Approach. Pearson.
Gonsalves, Tad. 2019. “The Summers and Winters of Artificial Intelligence.” In Advanced Methodologies and Technologies in Artificial Intelligence, Computer Simulation, and Human-Computer Interaction, 168–79. IGI Global.
Goodfellow, Ian, Yoshua Bengio, Aaron Courville, and Yoshua Bengio. 2016. Deep Learning. Vol. 1. Cambridge: MIT Press.
Grace, Katja, John Salvatier, Allan Dafoe, Baobao Zhang, and Owain Evans. 2018. “When Will AI Exceed Human Performance? Evidence from AI Experts.” Journal of Artificial Intelligence Research 62: 729–54.
Greene, David Perry. 1987. “Automated Knowledge Acquisition: Overcoming the Expert System Bottleneck.” In ICIS, 33.
Gupta, Santosh, Kishor Bhurchandi, and Avinash Keskar. 2016. “An Efficient Noise-Robust Automatic Speech Recognition System Using Artificial Neural Networks.” In 2016 International Conference on Communication and Signal Processing (ICCSP), 1873–7. IEEE.
Haenlein, Michael, and Andreas Kaplan. 2019. “A Brief History of Artificial Intelligence: On the Past, Present, and Future of Artificial Intelligence.” California Management Review 61 (4): 5–14.
Henderson, James. 2010. “Artificial Neural Networks.” In The Handbook of Computational Linguistics and Natural Language Processing, 221. Wiley Online Library.
Jiménez, Sergio, Tomás De La Rosa, Susana Fernández, Fernando Fernández, and Daniel Borrajo. 2012. “A Review of Machine Learning for Automated Planning.” The Knowledge Engineering Review 27 (4): 433–67.
Jordan, Michael, and Tom Mitchell. 2015. “Machine Learning: Trends, Perspectives, and Prospects.” Science 349 (6245): 255–60.
Kotsiantis, Sotiris. 2007. “Supervised Machine Learning: A Review of Classification Techniques.” Emerging Artificial Intelligence Applications in Computer Engineering 160 (1): 3–24.
Lee, Yen-Lin, Pei-Kuei Tsung, and Max Wu. 2018. “Technology Trend of Edge AI.” In 2018 International Symposium on VLSI Design, Automation and Test (VLSI-DAT), 1–2. IEEE.
Leonetti, Matteo, Luca Iocchi, and Peter Stone. 2016. “A Synthesis of Automated Planning and Reinforcement Learning for Efficient, Robust Decision-Making.” Artificial Intelligence 241: 103–30.
Lin, Patrick. 2011. “Introduction.” In Robot Ethics: The Ethical and Social Implications of Robotics.
McCulloch, Warren, and Walter Pitts. 1943. “A Logical Calculus of the Ideas Immanent in Nervous Activity.” Bulletin of Mathematical Biology 5: 115–33.
Murphy, Robin. 2000. Introduction to AI Robotics. MIT Press.
Nassif, Ali Bou, Ismail Shahin, Imtinan Attili, Mohammad Azzeh, and Khaled Shaalan. 2019. “Speech Recognition Using Deep Neural Networks: A Systematic Review.” IEEE Access 7: 19143–65.
Newell, Allen. 1980. “Physical Symbol Systems.” Cognitive Science 4 (2): 135–83.
Newell, Allen, and Herbert Simon. 2007. “Computer Science as Empirical Inquiry: Symbols and Search.” In ACM Turing Award Lectures, 1975.
Nilsson, Nils. 2009. The Quest for Artificial Intelligence. Cambridge University Press.
Partridge, Derek. 1987. “The Scope and Limitations of First Generation Expert Systems.” Future Generation Computer Systems 3 (1): 1–10.
Potamianos, Alexandros, Shrikanth Narayanan, and Sungbok Lee. 1997. “Automatic Speech Recognition for Children.” In Fifth European Conference on Speech Communication and Technology.
Pouyanfar, Samira, Saad Sadiq, Yilin Yan, Haiman Tian, Yudong Tao, Maria Presa Reyes, Mei-Ling Shyu, Shu-Ching Chen, and SS Iyengar. 2018. “A Survey on Deep Learning: Algorithms, Techniques, and Applications.” ACM Computing Surveys (CSUR) 51 (5): 1–36.
Prince, Simon. 2012. Computer Vision: Models, Learning, and Inference. Cambridge University Press.
Rajan, Kanna, and Alessandro Saffiotti. 2017. “Towards a Science of Integrated AI and Robotics.” Elsevier.
Ray, Tim. 1998. “A Case Study of Britain’s Response to Miti’s Fifth Generation Computer Initiative.” Technology and Innovation in Japan: Policy and Management for the Twenty First Century 18: 151.
Rosenfeld, Azriel. 1988. “Computer Vision: Basic Principles.” Proceedings of the IEEE 76 (8): 863–68.
Russell, Stuart, and Peter Norvig. 2010. Artificial Intelligence: A Modern Approach. 3rd ed. Upper Saddle River: Prentice Hall.
Schank, Roger. 1983. “The Current State of AI: One Man’s Opinion.” AI Magazine 4 (1): 3–3.
Searle, John. 1990. “Minds, Brains, and Programs.” In The Philosophy of Artificial Intelligence.
Sparck Jones, Karen. 1999. “What Is the Role of NLP in Text Retrieval?” In Natural Language Information Retrieval, 1–24. Springer.
Trucco, Emanuele, and Alessandro Verri. 1998. Introductory Techniques for 3-d Computer Vision. Vol. 201. Prentice Hall Englewood Cliffs.
Weld, Daniel. 1999. “Recent Advances in AI Planning.” AI Magazine 20 (2): 93–93.
Russell and Norvig (2010) identify four main definitions of artificial intelligence: (1) systems that think like humans; (2) systems that act like humans; (3) systems that think rationally; (4) systems that act rationally. According to Haenlein and Kaplan (2019), this lack of agreement is due to two factors: first, it would require agreeing on how to define human intelligence; second, there is little evidence that proves how much machine intelligence resembles human intelligence.↩︎
Leaving aside the risks of such systems, there are important ethical implications with no easy answer.↩︎
As Agrawal, Gans, and Goldfarb (2018a) affirmed, “We assume that this process of determining payoffs requires human understanding of the situation: it is not a prediction problem”.↩︎
A method that was later renamed the “Turing Test”.↩︎
A system that applied human problem-solving protocols to solve simple problems. Other early achievements included the development of the high-level language LISP and the invention of time-sharing systems for optimizing the use of computing resources (Nilsson 2009).↩︎
This is also called the “physical symbol systems” hypothesis (Newell 1980)↩︎
From a structural perspective, they are divided into two subsystems: the knowledge base (that contains facts and rules) and the inference engine (that combines rules and known facts to deduce unknown facts).↩︎
Common algorithms used in supervised learning are linear and logistic regression, naïve Bayes, support vector machines, and K-nearest-neighbour (Kotsiantis 2007).↩︎
An example of the output of a payoff function might be game scores: they can be given at every step, as in a ping-pong game, or only at the end, as in chess (Russell and Norvig 2010).↩︎
Such as noise (Gupta, Bhurchandi, and Keskar 2016), foreign accents (Eskenazi 1999), and children’s voices (Potamianos, Narayanan, and Lee 1997).↩︎
Poor performance may also depend on training with insufficient data of adverse situations. It is possible that if noisy data is provided during the training process, the model performance will improve (Catania et al. 2019).↩︎
In particular, GPT-3, the autoregressive language model developed by OpenAI and introduced in May 2020, with 175 billion parameters, can generate fiction, poetry, newspaper articles, programming code, and probably much more. At the time of writing, this kind of technology might prove to be a game-changer in NLP development.↩︎