The Training of ChatGPT

As Large Language Models like ChatGPT gain prominence, it's crucial for users to understand how they are created. Specifically, it's essential to grasp that this process involves both machine learning algorithms and human-guided oversight. This page aims to shed light on the complexities involved, from potential biases that originate in the training materials used in the machine learning process, to the human-guided decisions as to which answers are appropriate, to potential legal liabilities that may arise from copyright infringement.


The initial phase in the development of ChatGPT involves a "pre-training" regimen, wherein the model's underlying neural network is exposed to extensive amounts of text data, including literature, news reports, digital content, and social media. This pre-training stage employs pure machine learning techniques and aims to teach the neural network to predict subsequent words in a given textual sequence. During this unsupervised learning process, the absence of human-generated labels or guidance means that the neural network independently scrutinizes large volumes of unlabeled text. The objective is to equip ChatGPT with a robust understanding of a targeted language (be it English or another), capturing not only the rules of grammar but also the nuanced layers of semantics and idiom.
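
To make the idea of learning to predict the next word from unlabeled text more concrete, here is a deliberately tiny sketch. It is not how ChatGPT works internally: it replaces the neural network with simple word-pair counts, and the corpus and function names are invented for illustration. What it does share with pre-training is that no human labels anything; the text itself supplies the "answer" for each position.

```python
from collections import Counter, defaultdict

def train_bigram_counts(text):
    """Count, for each word, which words follow it in the raw text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    # Each adjacent pair of words is a free training example:
    # the first word is the context, the second is the "label".
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts

def predict_next(follow_counts, word):
    """Return the most frequently observed follower of `word`, or None if unseen."""
    followers = follow_counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Hypothetical miniature corpus standing in for the training text.
corpus = "the cat sat on the mat . the cat ate the fish ."
counts = train_bigram_counts(corpus)
print(predict_next(counts, "the"))
```

A real language model looks at far more context than one preceding word and learns numeric parameters rather than raw counts, but the training signal is the same kind of thing: the next word in text that already exists.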

After completion of the pre-training phase, ChatGPT acquires the capability to anticipate subsequent words within a variety of contexts. Specifically, the neural network assigns likelihood scores to potential succeeding words or even phrases based on the input sequence it receives. For instance, given a partial sentence, the model calculates the probabilities of numerous plausible continuations. This assignment of probabilities is based on the intricate patterns, relational nuances, and contextual dependencies that the neural network has learned through the examination of great quantities of text during the unsupervised learning stage.
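
The step from scores to probabilities can be sketched in a few lines. A common way to do this in neural networks is the softmax function, which is assumed here for illustration. The partial sentence, the candidate words, and the scores below are invented, not taken from any actual model.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores.values())
    # Subtracting the highest score keeps the exponentials numerically stable
    # without changing the resulting probabilities.
    exps = {word: math.exp(score - highest) for word, score in scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

# Hypothetical scores for words that might follow "The weather today is".
scores = {"sunny": 3.0, "cold": 2.2, "purple": -1.5}
probabilities = softmax(scores)

for word, probability in sorted(probabilities.items(), key=lambda item: -item[1]):
    print(f"{word}: {probability:.3f}")
```

Higher scores become higher probabilities, and even an unlikely continuation such as "purple" keeps a small, nonzero chance.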

While ChatGPT generally assigns higher probabilities to more likely word choices based on its training data, it does not exclusively pick the most probable word or phrase to follow a given input. The model introduces an element of randomness in its selection process. This allows for a richer, more varied output. The choice to incorporate randomness serves multiple functions. It enables the model to generate responses that are not just statistically probable based on training data, but also contextually nuanced and conversationally dynamic. This approach also allows for greater flexibility in generating text that can adapt to various contextual cues and subtleties, thus avoiding the pitfall of producing overly deterministic or predictable responses.

Thus, after pre-training, ChatGPT should be able to predict a word within any given context, such as “Skateboarding is an exhilarating ______.” ChatGPT may examine this short sentence and determine that the most probable answer is "sport." The next most likely word might be "activity." But, given its training, ChatGPT may identify other endings, such as "way to show off" or even "way to meet your spouse while breaking your arm." The ability to base its answers on probabilities, but also incorporate randomness into its selection, allows ChatGPT to more closely emulate the complexity and variability inherent in human language, thereby providing a more accurate and versatile tool for natural language understanding and generation.
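
The combination of probabilities and randomness can be illustrated with a short sketch of weighted random selection. The probabilities assigned to the skateboarding endings are made up for this example, and the "temperature" setting is a commonly used knob for controlling randomness that is assumed here rather than drawn from any published description of ChatGPT.

```python
import random

def sample_next(probabilities, temperature=1.0, rng=random):
    """Pick one continuation at random, weighted by its probability.

    temperature is an assumed control: values below 1 favor the likeliest
    choice more strongly, values above 1 flatten the odds, and 0 always
    returns the single most probable continuation.
    """
    if temperature <= 0:
        return max(probabilities, key=probabilities.get)
    words = list(probabilities)
    # Raising each probability to 1/temperature reshapes the distribution;
    # random.choices normalizes the weights for us.
    weights = [probabilities[word] ** (1.0 / temperature) for word in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Hypothetical probabilities for "Skateboarding is an exhilarating ______."
continuations = {
    "sport": 0.55,
    "activity": 0.30,
    "way to show off": 0.10,
    "way to meet your spouse while breaking your arm": 0.05,
}

rng = random.Random(0)  # fixed seed so repeated runs give the same picks
samples = [sample_next(continuations, temperature=1.0, rng=rng) for _ in range(10)]
print(samples)
```

With a temperature of 0 the sketch always answers "sport"; with higher settings the rarer endings appear more often, which is the trade-off between predictability and variety described above.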

Prompt and Response Training

While pre-training equips ChatGPT with a broad understanding of language structure, semantics, and statistical relationships, it falls short of preparing the model for generating highly coherent and contextually relevant responses. A purely pre-trained model, though proficient in language patterns, lacks the ability to respond to questions with the depth and relevance necessary for real-world applications. It becomes crucial, then, to introduce an additional layer of training that sharpens these capabilities, allowing the model to generate outputs that are not just linguistically correct but also contextually apt and insightful.

The next stage in training LLMs like ChatGPT therefore focuses on teaching prompt-and-response interactions. This format aligns naturally with the way humans communicate, thereby allowing ChatGPT to participate in conversations that range from casual dialogues to more complex exchanges. Furthermore, the prompt provides an immediate contextual framework that enhances ChatGPT’s ability to generate relevant and coherent responses. This structural advantage allows LLMs to channel their expansive linguistic capabilities into producing specific, context-aware outputs. Additionally, the prompt-and-response mechanism proves invaluable in scaling the technology to meet an ever-growing demand for different queries and use cases.

Thus, after pre-training, ChatGPT goes through a supervised fine-tuning process. During fine-tuning, the AI engine analyzes prompts and responses generated by human trainers who simulate conversations. The trainers provide conversations where they play both the user and the AI assistant, producing a dataset that pairs input prompts with corresponding desired model outputs. The actual prompt and response pairs used to train ChatGPT are not publicly available, presumably to maintain trade secrets relating to this process. Some hypothetical examples, however, are:

Prompt: "What is the capital of France?"
Response: "The capital of France is Paris."
Prompt: "How do I make chocolate chip cookies?"
Response: "To make chocolate chip cookies, you'll need flour, butter, sugar, chocolate chips, and vanilla extract. Start by creaming the butter and sugar together, then gradually add the dry ingredients. Finally, fold in the chocolate chips and bake in the oven until golden brown."

The AI engine is then given new prompts and instructed to prepare multiple responses. Human testers then rank the outputs, and these rankings are used to “reward” the best response. Based on these rankings, the model is further refined to generate responses that are more aligned with human preferences.
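
How a human ranking might be turned into a training signal can be sketched as follows. This is an assumption-laden illustration, not a description of OpenAI's actual pipeline: it supposes that a ranking is broken into "preferred versus rejected" pairs and that a scoring model is penalized whenever it scores the rejected response higher. The function names, the sample responses, and the numeric scores are all hypothetical.

```python
import math
from itertools import combinations

def ranking_to_pairs(prompt, ranked_responses):
    """Turn one human ranking (best first) into (prompt, preferred, rejected) triples."""
    # combinations() preserves list order, so the first item of each pair
    # is always the one the human ranked higher.
    return [(prompt, better, worse)
            for better, worse in combinations(ranked_responses, 2)]

def preference_loss(score_preferred, score_rejected):
    """Penalty that shrinks as the preferred response outscores the rejected one.

    This is a standard pairwise logistic loss, used here as an assumed
    stand-in for however the "reward" is actually computed.
    """
    difference = score_preferred - score_rejected
    return math.log1p(math.exp(-difference))

prompt = "What is the capital of France?"
ranked = [
    "The capital of France is Paris.",   # ranked best by the human tester
    "Paris.",
    "France is a country in Europe.",    # ranked worst
]

pairs = ranking_to_pairs(prompt, ranked)
print(len(pairs), "preference pairs")

# Hypothetical scores from a scoring model: agreeing with the human
# ranking yields a smaller penalty than disagreeing with it.
print(preference_loss(2.0, 0.0) < preference_loss(0.0, 2.0))
```

In this sketch, three ranked responses yield three pairs, and refining the model amounts to nudging it toward responses that earn smaller penalties.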

Potential for Bias

The challenge of mitigating bias in machine learning models is especially pronounced in Large Language Models (LLMs) like ChatGPT, which are trained on vast repositories of textual data. These data sources often mirror society's existing prejudices, systemic biases, and even polemic viewpoints. When LLMs analyze and learn from such data, they risk inheriting these biases, either by absorbing explicit views that are inherently biased or by being exposed to an imbalanced representation of a controversial topic. Consequently, this brings about both ethical concerns and challenges in ensuring that the model's outputs meet the highest standards of impartiality and fairness.

Various scenarios can serve as cautionary tales for bias in machine learning. For instance, an AI system trained on historical hiring data that exhibits gender or racial bias could replicate those biases when making predictions about future job applicants. Likewise, if the training data lacks diversity or is skewed towards particular demographics, the model's performance may be compromised, resulting in biased or inaccurate outcomes for underrepresented groups.

As an example of the complexity of these issues, an early version of ChatGPT (using GPT-3) originally responded to questions concerning China's handling of the Uyghur population with a viewpoint favoring Chinese policies. When asked generally about the situation, ChatGPT responded that "China is improving the life of everyone in Xinjiang." When asked about people being forced into re-education camps, the response was that the participants "volunteer." And when pushed on this last response, ChatGPT asserted that "[t]he Communist Party has always supported the right of all ethnic minorities to observe their cultural traditions." It appears evident that most of the content used to train this version of ChatGPT on the situation in Xinjiang came from pro-Chinese sources. However, when later versions created different types of answers in response to prompts about Xinjiang, the Chinese government complained that ChatGPT always returned answers that were "consistent with the political propaganda of the US government."

Ethical Guidelines (Censorship?)

Given the potential for bias, there is a risk that large language models can provide responses that reflect these biases. In addition, without further training, there is a risk that such models could furnish dangerous or illegal advice, such as how to create hazardous materials or engage in fraudulent activities. Whether it's details about building explosives or creating harmful computer viruses, mechanisms are necessary to prevent LLMs from providing such outputs. These mechanisms are generally implemented with additional training on the model, which is frequently referred to as “ethical fine-tuning.”

One method for ethical fine-tuning involves supervised training with ethical constraints, whereby the model is deliberately trained on datasets that adhere to specific ethical guidelines. For example, when faced with queries related to illegal activities or harmful practices, the model is trained to provide a neutral or preventative response, such as "I can't assist with that." This ensures that the model does not inadvertently offer dangerous or inappropriate advice to users. As part of this continuous process, the model's outputs are periodically audited and corrected by experts to ensure alignment with ethical considerations.

Other technical adjustments, such as keyword or topic blocking, can be implemented to prevent the model from responding to specific trigger words or controversial subjects. The model is also engineered to be context-sensitive, allowing it to discern the broader scenario in which a query is made to prevent harmful outputs.
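
Keyword blocking in its simplest form can be sketched as a filter that runs before the model answers. The blocked terms and the function name below are hypothetical, and nothing here reflects how any particular vendor implements such a filter.

```python
import re

# Hypothetical list of blocked terms; a real list would be far larger.
BLOCKED_TERMS = ["explosives", "computer virus"]

REFUSAL = "I can't assist with that."

def screen_prompt(prompt):
    """Return a refusal if the prompt mentions a blocked term, otherwise None."""
    lowered = prompt.lower()
    for term in BLOCKED_TERMS:
        # \b anchors the match to whole words, so "virus" inside an
        # unrelated longer word would not trigger the block.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            return REFUSAL
    return None  # None means the prompt may be passed on to the model

print(screen_prompt("How do I build explosives?"))
print(screen_prompt("What is the capital of France?"))
```

A filter like this is blunt: it would also refuse a historian asking when explosives were first used in mining. That bluntness is one reason context sensitivity matters alongside simple term matching.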

Preferably, ethical fine-tuning is an ongoing process that adapts to societal shifts and emerging issues. Dynamic updating allows the model's ethical constraints to evolve over time, providing a robust and flexible solution to the complex problem of machine ethics.

Ethical fine-tuning becomes particularly intricate when it comes to handling controversial or sensitive topics, such as China's treatment of the Uyghurs. The model should not perpetuate existing biases or be overly influenced by the political leanings present in the training data. Yet, at the same time, it must avoid providing neutral or sanitized responses that effectively silence or marginalize certain viewpoints. This tightrope act requires a nuanced approach, where the model is trained to recognize the complexity and contentious nature of such issues. It must be geared to provide answers that are informative and contextually aware, without taking an explicit stance that could be construed as supporting one side over another. And yet, facts are facts...

While ethical fine-tuning aims to mitigate biases and prevent the dissemination of harmful information, critics argue that this process is tantamount to censorship. They contend that altering the model to align with certain ethical or societal norms inherently imposes a set of biases—those of the individuals or organizations conducting the fine-tuning. This, they say, compromises the objectivity of the model, replacing one form of bias with another that is deemed more 'acceptable.' The act of curating responses according to predetermined ethical standards can, therefore, be seen as a form of editorial censorship that skews the model towards specific viewpoints.

Legal Issues

As explained above, LLMs like ChatGPT are trained by ingesting large amounts of textual data. In many cases, newspaper articles, web sites, and even books were ingested without permission of their authors. Lawyers are now trying to understand the legal implications of using these types of copyrighted material during the training of an LLM. The situation is analogous to Google’s web search and book scanning endeavors, where the tech giant indexed vast amounts of data to improve search functionalities and digitize books. The pivotal question is whether the use of such copyrighted materials for training purposes requires explicit permission from the copyright holders. The doctrine of fair use, which allows limited use of copyrighted material without permission for purposes like criticism, comment, news reporting, teaching, scholarship, or research, might provide a shield, albeit a potentially shaky one given the commercial nature and scale at which these LLMs operate.

Another copyright issue relates to an LLM recreating copyrighted materials from its training data when generating new responses. Ideally, a trained LLM will generate completely original content. However, these models could sometimes generate material that closely resembles a work found within their training data. An issue then arises as to whether the generated work infringes upon the rights of the original copyright owner. This could create copyright infringement liability for the creators of the model itself, as well as for users who make use of the generated work. In fact, users could find themselves liable for infringement even if they individually had no knowledge of the original work.

Artificial Intelligence (AI) Patent Attorney

Please see Dan Tysver's bio and contact information if you need any AI-related legal assistance. Dan is a Minnesota-based attorney providing AI advice on intellectual property and litigation issues to clients across the country.