Introduction to Artificial Intelligence
Artificial Intelligence (AI) has long been the stuff of science fiction, but in recent years, it has transitioned into a reality that is reshaping our daily lives. Beginning with rudimentary expert systems that captured and replicated specific human knowledge, AI has evolved dramatically to give birth to Large Language Models (LLMs) capable of generating human-like text with astonishing proficiency. This page aims to demystify AI by providing a clear definition and walking through its evolutionary journey—from the early days of rule-based systems to the sophisticated neural networks of today.
This page is divided into these parts:
- Expert Systems
- Machine Learning and Neural Networks
- Generative AI & Large Language Models (LLMs)
- Image Generation
A Definition of Artificial Intelligence
Artificial intelligence ("AI") is broadly recognized as the simulation of human intelligence executed by machines, typically computer systems. The core objective of AI is to construct machines capable of performing tasks that would traditionally necessitate human cognitive functions. These functions encompass learning, reasoning, and problem-solving capabilities, as well as the aptitude for making informed decisions.
This basic definition, however, belies the complexity and diversity of AI as a field. It is a multidisciplinary domain that draws upon a variety of techniques, algorithms, and methodologies. These can range from simple rule-based systems to intricate neural networks and machine learning models. The range and depth of AI technologies continue to evolve, making it a dynamic and ever-expanding field.
Expert Systems
Expert systems sit at the simple end of this spectrum. They are designed to capture and emulate the knowledge and reasoning of human experts. In these systems, the knowledge and expertise of human specialists are recorded as rules, facts, and logic in a computer program. These rules are typically written in an if-then format. For example, a rule in a medical expert system might state: “If a patient experiences severe abdominal pain in the lower right quadrant of their abdomen, then they might be having an appendicitis attack.”
One limitation of expert systems is that they rely upon developing explicit rules that define the intelligence imparted by the expert. For instance, the above rule concerning an appendicitis attack could be part of a very large, comprehensive set of interdependent rules that define all human medical diagnoses. A second rule that requires checking the patient’s white blood cell count might be triggered by the first rule relating to abdominal pain.
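To make the idea concrete, here is a minimal sketch in Python of how such chained if-then rules might be encoded. The facts, thresholds, and rule names are illustrative assumptions rather than actual medical logic.

```python
# Minimal sketch of a rule-based expert system (illustrative only).
# Facts are stored in a dictionary; each rule is an if-then check that
# can add new facts, allowing one rule to trigger another.

facts = {
    "abdominal_pain_lower_right": True,   # assumed patient findings
    "white_blood_cell_count": 14000,      # illustrative value
}

def rule_abdominal_pain(facts):
    # IF severe pain in the lower right quadrant THEN suspect appendicitis
    if facts.get("abdominal_pain_lower_right"):
        facts["suspect_appendicitis"] = True

def rule_check_wbc(facts):
    # Triggered only after the first rule raises a suspicion of appendicitis.
    # IF appendicitis is suspected AND the white blood cell count is elevated
    # THEN recommend an imaging study (threshold is an illustrative assumption).
    if facts.get("suspect_appendicitis") and facts["white_blood_cell_count"] > 11000:
        facts["recommend_imaging"] = True

for rule in (rule_abdominal_pain, rule_check_wbc):
    rule(facts)

print(facts)
```

In a real expert system the rule base would contain hundreds or thousands of such interdependent rules, but the chaining mechanism is the same: one rule's conclusion becomes another rule's condition.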
Knowledge engineering is a pivotal aspect of developing expert systems, serving as the bridge between human expertise and machine capability. Essentially, knowledge engineering involves capturing the specialized knowledge of an expert—in this case, a doctor specializing in diagnosing and treating appendicitis—and translating that knowledge into a format that the computer system can understand. The doctor must painstakingly explain the complex web of symptoms, diagnostic tests, and contextual factors that point towards an appendicitis attack. This often involves hours of interviews, consultations, and reviews to ensure that the system will be both accurate and comprehensive. On the other side, a programmer will then have the challenging task of converting this wealth of medical knowledge into a series of rules and decision-trees that the expert system can utilize. This too is time-consuming, often requiring multiple iterations and extensive testing to ensure reliability.
The development process for both the expert and the programmer is iterative and rigorous. For the doctor, this means constant involvement to clarify ambiguities, validate the drafted rules, and sometimes even update the knowledge base as medical science advances. For the programmer, the work extends beyond mere rule-setting; it involves establishing a user-friendly interface, and integrating the results with existing rules and databases. The intertwining of medical expertise and technical skill in knowledge engineering is both labor-intensive and intricate, underscoring the collaborative nature of creating a proficient expert system.
One significant advantage of expert systems lies in the transparency and verifiability of their decision-making process. Unlike the "black-box" nature of the AI models described below, the rule-based structure of an expert system allows for a clear, step-by-step delineation of how a conclusion was reached. This transparency is invaluable for both troubleshooting and accountability. If an error occurs (e.g., the system misdiagnoses a case of appendicitis), developers can trace back through the decision tree to identify the point of failure. Was an incorrect answer given to a question? Was a prompt ambiguous or poorly worded? Or did the system's rule base lack the necessary complexity to account for an outlier case?
Being able to scrutinize and dissect the system's logic in such a detailed manner enables timely and precise corrective actions. Developers can refine the wording of prompts, modify existing rules, or introduce new ones to better capture the complexities of the domain expertise. This iterative process of verification and refinement not only enhances the reliability of the expert system but also provides a framework for continuous improvement and adaptability.
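As a rough illustration of this kind of traceability, the sketch below (again using hypothetical rules, not actual medical logic) records which rules fired and what each one concluded, producing the sort of audit trail a developer could follow back to the point of failure.

```python
# Minimal sketch of an audit trail for a rule-based system: every rule that
# fires is recorded along with the fact it added, so an incorrect conclusion
# can be traced back to the specific rule or input at fault.

def run_rules(rules, facts):
    trace = []
    for name, condition, conclusion in rules:
        if condition(facts):
            facts.update(conclusion)
            trace.append((name, conclusion))
    return trace

# Rule definitions are illustrative assumptions.
rules = [
    ("lower-right pain", lambda f: f.get("pain_lower_right"),
     {"suspect_appendicitis": True}),
    ("elevated WBC", lambda f: f.get("suspect_appendicitis") and f.get("wbc", 0) > 11000,
     {"recommend_imaging": True}),
]

trace = run_rules(rules, {"pain_lower_right": True, "wbc": 14000})
for name, added in trace:
    print(f"rule '{name}' fired and added {added}")
```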
While expert systems offer numerous advantages, they are not without drawbacks, chief among them being their inherent inflexibility. Expert systems are developed to function within a very specific domain of knowledge, making them unsuitable for tasks outside their programmed expertise. For instance, an expert system designed for medical diagnoses would be entirely ineffectual when applied to legal issues. Moreover, their rule-based nature makes them sensitive to changes in the domain knowledge. Should new medical findings emerge around appendicitis, for example, the expert system would not automatically adapt to this new information. It would require reprogramming, often an elaborate and time-consuming process, to integrate the new knowledge into its existing rule base.
Machine Learning and Neural Networks
As AI research progressed, the focus shifted away from the manual creation of rules for expert systems and toward automated learning approaches. These new approaches did not require an expert to provide rules to a programmer for coding, but rather allowed the machine to train itself on raw data. This concept led to the development of the modern “machine learning” algorithms that form the basis of most artificial intelligence systems in use today.
At its core, machine learning is a subset of artificial intelligence that enables computer systems to learn from data and improve their performance over time without being explicitly programmed for every task. Rather than relying on a fixed set of rules curated by human experts, machine learning algorithms derive rules and patterns directly from large datasets. While expert systems require manual updates for even minor changes in domain knowledge, machine learning models can adapt dynamically to make predictions or decisions in new, unseen situations.
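The contrast with hand-written rules can be illustrated with a small sketch, assuming the scikit-learn library is available; the tiny dataset and its features are purely hypothetical.

```python
# Minimal sketch of learning from examples instead of hand-written rules.
from sklearn.tree import DecisionTreeClassifier

# Each row: [severe lower-right pain (0 or 1), white blood cell count]
X = [[1, 14000], [1, 8000], [0, 12000], [0, 7000], [1, 15000], [0, 9000]]
y = [1, 0, 0, 0, 1, 0]  # 1 = appendicitis case in the training examples

model = DecisionTreeClassifier()
model.fit(X, y)                      # the model derives its own decision rules
print(model.predict([[1, 13000]]))   # prediction for a new, unseen patient
```

No expert ever states the rules explicitly; the algorithm infers them from the labeled examples, and retraining on new data updates the model without reprogramming.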
Multiple techniques exist in the realm of machine learning, each with its own set of advantages, drawbacks, and ideal use-cases. From decision trees and support vector machines to random forests and naive Bayes classifiers, the array of algorithms at a data scientist's disposal is broad. Among these, however, neural networks stand out for their unparalleled capabilities in handling complex and high-dimensional data. Their ability to automatically learn intricate patterns and representations makes them exceptionally versatile and powerful, particularly for tasks like image and speech recognition, natural language processing, and even game playing. In contrast to other machine learning algorithms that might struggle with the complexity and scale of such problems, neural networks excel at them, often delivering superior performance.
A neural network is constructed from layers of interconnected nodes or "neurons," inspired by the neural structure of the human brain. Each neuron receives inputs, processes them using a weighted sum and an activation function, and passes the result to the neurons in the next layer. These weights are adjusted during the learning process through a technique known as backpropagation, which minimizes the error between the predicted and actual outcomes. The architecture of a neural network can vary significantly, with some networks having just a single layer of neurons while others have multiple layers, known as deep neural networks. The interconnections between neurons, the associated weights, and the method of adjusting these weights all contribute to the network's ability to learn and make predictions or decisions.
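A single neuron's computation can be sketched in a few lines, assuming NumPy is available; the input values and weights below are arbitrary placeholders.

```python
# Minimal sketch of a single artificial neuron: a weighted sum of its inputs
# passed through an activation function.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

inputs  = np.array([0.5, 0.8, 0.2])   # signals from the previous layer
weights = np.array([0.4, -0.6, 0.9])  # adjusted during training (backpropagation)
bias    = 0.1

output = sigmoid(np.dot(weights, inputs) + bias)  # passed on to the next layer
print(output)
```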
Neural networks are organized into layers, and the term "deep" in deep learning refers to a network that uses many such layers. For example, a neural network designed to identify whether a photograph contains an image of a cat or a dog may utilize three layers (a brief code sketch follows the list):
- Layer 1, which recognizes edges in an image (lines and arcs).
- Layer 2, which recognizes shapes in the image based on the recognized edges (triangles and circles).
- Layer 3, which recognizes objects in the image based on the recognized shapes (dogs and cats).
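A minimal sketch of such a layered image classifier, assuming TensorFlow/Keras is installed, might look like the following. The layer sizes are illustrative choices, and in practice the features each layer learns are not as cleanly separated as the list above suggests.

```python
# Minimal sketch of a layered "cat vs. dog" image classifier.
import tensorflow as tf

model = tf.keras.Sequential([
    # Early convolutional layer: learns low-level features such as edges.
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    # Deeper convolutional layer: combines edges into shapes.
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    # Final layers: combine shapes into an object-level "cat or dog" decision.
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```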
Generative AI & Large Language Models (LLMs)
Generative AI is a type of AI that is designed to create new content that resembles the example content it was trained upon. To achieve this, it learns the underlying patterns and characteristics of the data it encounters using deep learning. Effectively, the generative AI learns the probability distribution of the data that it analyzes. With this knowledge, it can then generate fresh output that is similar to what it has seen before while being completely original. These generative AI models can produce various types of content like images, text, and audio, all inspired by the patterns found in the training data.
Large language models (or LLMs) are a type of generative AI that can produce text in the form of properly formatted sentences and paragraphs. These models are trained on vast amounts of text data, which allows them to learn the patterns and relationships present in human language. This text data can be drawn from many sources, but it is mostly taken from the Internet. During training, LLMs analyze the data using deep learning and develop an understanding of the statistical probabilities of words and phrases occurring in different contexts. This knowledge is then used during language generation. When given a prompt, the model predicts the most probable next word or sequence of words relevant to that prompt based on its "understanding" of language patterns learned from its training data.
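A toy sketch can illustrate the statistical idea of next-word prediction, though real LLMs use deep neural networks trained on vastly more text; the tiny "corpus" below is invented for the example.

```python
# Toy sketch of next-word prediction from counted word pairs.
from collections import Counter, defaultdict

corpus = ("the patient has severe pain . the patient has a fever . "
          "the doctor orders a test .").split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    # Return the statistically most probable next word seen in training.
    return following[word].most_common(1)[0][0]

print(predict_next("patient"))  # -> "has"
print(predict_next("the"))      # -> "patient"
```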
ChatGPT is a conversational LLM, meaning that it allows users to interact with the AI using chat-type human conversations. A conversational LLM simulates human-like conversations with users, which creates a more immersive experience. A conversational LLM is trained to analyze user queries, retrieve relevant information, and generate appropriate responses based upon the learning of its neural network. Modern systems such as ChatGPT can grasp nuances in the way questions are posed, which allows them to respond more helpfully. Furthermore, conversational LLMs can tailor responses to individual users based on their unique preferences and past interactions. These types of systems generally group interactions into separate conversations. All of the communications within a single chat grouping are used by the LLM as the relevant context to formulate each new response.
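The sketch below shows one plausible way a chat application might keep a conversation's messages together as context; the generate_reply function is a hypothetical placeholder, not an actual OpenAI or ChatGPT API.

```python
# Minimal sketch of grouping messages into a single conversation so the full
# history can be supplied as context for each new response.

def generate_reply(conversation):
    # Placeholder: a real system would send the whole message list to the model.
    return f"(model reply based on {len(conversation)} prior messages)"

conversation = []  # one chat grouping; a new chat would start a fresh list

for user_message in ["What is appendicitis?", "How is it usually treated?"]:
    conversation.append({"role": "user", "content": user_message})
    reply = generate_reply(conversation)   # full history supplied as context
    conversation.append({"role": "assistant", "content": reply})
    print(reply)
```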
ChatGPT is a particular application that uses an advanced conversational LLM model. In November of 2022, ChatGPT first began using version 3.5 of its underlying LLM, which was aptly named GPT-3.5. In March of 2023, ChatGPT began using the GPT-4 model. Both of these models, and ChatGPT itself, were developed and are distributed by OpenAI of San Francisco, CA. Bard is another LLM, this one created by Google. Bing Chat is Microsoft’s conversational LLM offering, which generally requires use of the Microsoft Edge browser.
Image Generation
Some generative AIs focus on the generation of images. These types of AIs are exposed to a vast quantity of landscape photography, abstract art, portraits, corporate logos, anime, and drawings. By studying these input images, the AI grasps the underlying patterns and intricacies, enabling it to understand how each style or type of image is constructed. Once the AI has learned from these examples, it can generate entirely new images from scratch. Given a starting point in the form of a prompt, it uses its understanding of different artistic elements and styles to create original artwork that resembles portions of the input data. It can produce vivid landscapes, abstract compositions, lifelike portraits, and much more, making it an impressive digital artist. And just as with ChatGPT, a user can improve the results by changing the provided prompts. For example, the user could provide initial instructions to the AI, such as requesting an abstract landscape with vibrant colors and a dreamy atmosphere. The AI then generates a first draft based on these guidelines. The user can review the initial output and offer further prompts that better describe the desired mood or ask for more blue tones, and the AI engine will refine the image accordingly.
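This iterative refinement loop can be sketched in schematic form; generate_image below is a hypothetical placeholder standing in for a real image-generation model.

```python
# Minimal sketch of iterative prompt refinement for image generation.

def generate_image(prompt):
    # Placeholder: a real model would return image data for the prompt.
    return f"<image rendered from: '{prompt}'>"

prompt = "an abstract landscape with vibrant colors and a dreamy atmosphere"
draft = generate_image(prompt)
print(draft)

# After reviewing the first draft, the user refines the prompt.
prompt += ", with a calmer mood and more blue tones"
revised = generate_image(prompt)
print(revised)
```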
Artificial Intelligence (AI) Patent Attorney
Please see Dan Tysver's bio and contact information if you need any AI-related legal assistance. Dan is a Minnesota-based attorney providing AI advice on intellectual property and litigation issues to clients across the country.