mirror of
https://github.com/datawhalechina/llms-from-scratch-cn.git
synced 2026-01-13 16:57:18 +08:00
2392 lines
144 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "45398736-7e89-4263-89c8-92153baff553",
|
||
"metadata": {},
|
||
"source": [
|
||
"<font size=\"1\">\n",
|
||
"Supplementary code for \"Build a Large Language Model From Scratch\": <a href=\"https://www.manning.com/books/build-a-large-language-model-from-scratch\">https://www.manning.com/books/build-a-large-language-model-from-scratch</a> by <a href=\"https://sebastianraschka.com\">Sebastian Raschka</a><br>\n",
|
||
"Code repository: <a href=\"https://github.com/rasbt/LLMs-from-scratch\">https://github.com/rasbt/LLMs-from-scratch</a>\n",
|
||
"</font>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "66dd524e-864c-4012-b0a2-ccfc56e80024",
|
||
"metadata": {
|
||
"id": "66dd524e-864c-4012-b0a2-ccfc56e80024"
|
||
},
|
||
"source": [
"# Chapter 5: Pretraining on Unlabeled Data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 52,
|
||
"id": "92b989e9-da36-4159-b212-799184764dd9",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"matplotlib version: 3.5.2\n",
|
||
"numpy version: 1.24.4\n",
|
||
"tiktoken version: 0.6.0\n",
|
||
"torch version: 2.1.0\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"from importlib.metadata import version\n",
|
||
"\n",
|
||
"pkgs = [\"matplotlib\", \"numpy\", \"tiktoken\", \"torch\"]\n",
"for p in pkgs:\n",
"    print(f\"{p} version: {version(p)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0a3bdf9e-2ff0-4a57-abab-ede2d955a237",
|
||
"metadata": {},
|
||
"source": [
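"- In this chapter, we implement the training loop and basic model evaluation code to pretrain an LLM\n",
"- At the end of this chapter, we also load the openly available pretrained weights from OpenAI into our model"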
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "efd27fcc-2886-47cb-b544-046c2c31f02a",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"images/img-1.webp\" width=500px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0d214765-7a73-42d5-95e9-302154b29db9",
|
||
"metadata": {},
|
||
"source": [
"- The topics covered in this chapter are shown in the figure below"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f67711d4-8391-4fee-aeef-07ea53dd5841",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"images/img-2.webp\" width=400px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0d824183-145c-4865-89e1-1f0d0a338f19",
|
||
"metadata": {
|
||
"id": "0d824183-145c-4865-89e1-1f0d0a338f19"
|
||
},
|
||
"source": [
"## 5.1 Evaluating generative text models"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a3350f8c-5181-4f9b-a789-4523105e98f2",
|
||
"metadata": {},
|
||
"source": [
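"- We start with a brief recap of initializing a GPT model using the code from the previous chapter\n",
"- Then, we discuss basic evaluation metrics for LLMs\n",
"- Lastly, in this section, we apply these evaluation metrics to a training and a validation dataset"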
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "bdc1cf3f-82d8-46c7-9ecc-58979ce87cdd",
|
||
"metadata": {
|
||
"id": "bdc1cf3f-82d8-46c7-9ecc-58979ce87cdd"
|
||
},
|
||
"source": [
"### 5.1.1 Using GPT to generate text"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5b3415fd-9f4a-4548-908e-9dfa56edc9bc",
|
||
"metadata": {},
|
||
"source": [
"- We initialize a GPT model using the code from the previous chapter"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"id": "86000d74-624a-48f0-86da-f41926cb9e04",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "86000d74-624a-48f0-86da-f41926cb9e04",
|
||
"outputId": "ad482cfd-5a62-4f0d-e1e0-008d6457f512"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import torch\n",
|
||
"from previous_chapters import GPTModel\n",
|
||
"\n",
|
||
"GPT_CONFIG_124M = {\n",
"    \"vocab_size\": 50257,  # Vocabulary size\n",
"    \"ctx_len\": 256,       # Shortened context length (orig: 1024)\n",
"    \"emb_dim\": 768,       # Embedding dimension\n",
"    \"n_heads\": 12,        # Number of attention heads\n",
"    \"n_layers\": 12,       # Number of layers\n",
"    \"drop_rate\": 0.1,     # Dropout rate\n",
"    \"qkv_bias\": False     # Query-key-value bias\n",
|
||
"}\n",
|
||
"\n",
|
||
"torch.manual_seed(123)\n",
|
||
"model = GPTModel(GPT_CONFIG_124M)\n",
|
||
"model.eval(); # Disable dropout during inference"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "09c6cf0f-7458-48a2-97fd-aa5068d65e8c",
|
||
"metadata": {},
|
||
"source": [
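"- We use a dropout rate of 0.1 above, but it is relatively common to train LLMs without dropout nowadays\n",
"- Modern LLMs also do not use bias vectors in the `nn.Linear` layers for the query, key, and value matrices (unlike earlier GPT models), which is achieved by setting `\"qkv_bias\": False`\n",
"- We use a context length (`ctx_len`) of only 256 tokens to reduce the computational resources needed to train the model, whereas the original 124-million-parameter GPT-2 model used 1024 tokens\n",
"  - This is so that more readers can run the code examples on a laptop\n",
"  - However, feel free to increase `ctx_len` to 1024 tokens (this requires no code changes)\n",
"  - Later, we will also load a model with `ctx_len = 1024` from pretrained weights"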
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "59f80895-be35-4bb5-81cb-f357ef7367fe",
|
||
"metadata": {},
|
||
"source": [
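"- Next, we use the `generate_text_simple` function from the previous chapter to generate text\n",
"- In addition, we define two convenience functions, `text_to_token_ids` and `token_ids_to_text`, for converting between token and text representations throughout this chapter"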
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "741881f3-cee0-49ad-b11d-b9df3b3ac234",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"images/img-3.webp\" width=500px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"id": "5e062b82-3540-48ce-8eb4-009686d0d16c",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Output text:\n",
|
||
" Every effort moves you rentingetic wasnم refres RexMeCHicular stren\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import tiktoken\n",
|
||
"from previous_chapters import generate_text_simple\n",
|
||
"\n",
"def text_to_token_ids(text, tokenizer):\n",
"    encoded = tokenizer.encode(text, allowed_special={'<|endoftext|>'})\n",
"    encoded_tensor = torch.tensor(encoded).unsqueeze(0)  # Add batch dimension\n",
"    return encoded_tensor\n",
"\n",
"def token_ids_to_text(token_ids, tokenizer):\n",
"    flat = token_ids.squeeze(0)  # Remove batch dimension\n",
"    return tokenizer.decode(flat.tolist())\n",
|
||
"\n",
|
||
"start_context = \"Every effort moves you\"\n",
|
||
"tokenizer = tiktoken.get_encoding(\"gpt2\")\n",
|
||
"\n",
|
||
"token_ids = generate_text_simple(\n",
"    model=model,\n",
"    idx=text_to_token_ids(start_context, tokenizer),\n",
"    max_new_tokens=10,\n",
"    context_size=GPT_CONFIG_124M[\"ctx_len\"]\n",
|
||
")\n",
|
||
"\n",
|
||
"print(\"Output text:\\n\", token_ids_to_text(token_ids, tokenizer))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e4d3249b-b2a0-44c4-b589-ae4b403b8305",
|
||
"metadata": {},
|
||
"source": [
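"- As we can see above, the model does not produce good text because it has not been trained yet\n",
"- How do we measure or capture what \"good text\" is in numeric form so that we can track it during training?\n",
"- The next subsection introduces metrics for calculating a loss on the generated outputs, which we can use to measure training progress\n",
"- The later chapters on finetuning LLMs will also introduce additional ways to measure model quality"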
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0f3d7ea2-637f-4490-bc76-e361fc81ae98",
|
||
"metadata": {
|
||
"id": "0f3d7ea2-637f-4490-bc76-e361fc81ae98"
|
||
},
|
||
"source": [
"### 5.1.2 Calculating the text generation loss: cross-entropy and perplexity"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "9e1ba8aa-fb03-4d25-957f-fe8778762440",
|
||
"metadata": {},
|
||
"source": [
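"- Suppose we have an `inputs` tensor containing the token IDs for 2 training examples (rows)\n",
"- Corresponding to the `inputs`, the `targets` contain the desired token IDs that we want the model to generate\n",
"- Notice that the `targets` are the `inputs` shifted by one position, as explained in chapter 2 when we implemented the data loader"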
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"id": "6b5402f8-ec0c-4a44-9892-18a97779ee4f",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "6b5402f8-ec0c-4a44-9892-18a97779ee4f",
|
||
"outputId": "8d6fa0ff-7b37-4634-c3f0-2c050cbe81f0"
|
||
},
|
||
"outputs": [],
|
||
"source": [
"inputs = torch.tensor([[16833, 3626, 6100],   # [\"every effort moves\",\n",
"                       [40,    1107, 588]])   #  \"I really like\"]\n",
"\n",
"targets = torch.tensor([[3626, 6100, 345  ],  # [\" effort moves you\",\n",
"                        [588,  428,  11311]]) #  \" really like chocolate\"]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "33dc0645-ac2c-4973-9b40-6da40515bede",
|
||
"metadata": {},
|
||
"source": [
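"- Feeding the `inputs` to the model, we obtain the logits for the 2 input examples, each consisting of 3 tokens\n",
"- Each token position yields a 50,257-dimensional vector, corresponding to the size of the vocabulary\n",
"- Applying the softmax function, we can turn the logits tensor into a tensor of the same shape containing probability scores"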
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"id": "e7b6ec51-6f8c-49bd-a349-95ba38b46fb6",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"torch.Size([2, 3, 50257])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"with torch.no_grad():\n",
"    logits = model(inputs)\n",
"\n",
"probas = torch.softmax(logits, dim=-1)  # Probability of each token in the vocabulary\n",
"print(probas.shape)  # Shape: (batch_size, num_tokens, vocab_size)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5c36a382-b5e2-4de6-9e65-0b69b685013b",
|
||
"metadata": {},
|
||
"source": [
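"- The figure below, which uses a very small vocabulary for illustration purposes, outlines how we convert the probability scores back into text, as discussed at the end of the previous chapter"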
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "384d86a9-0013-476c-bb6b-274fd5f20b29",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/proba-to-text.webp\" width=500px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e8480efd-d419-4954-9ecc-2876055334bd",
|
||
"metadata": {},
|
||
"source": [
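"- As discussed in the previous chapter, we can apply the `argmax` function to convert the probability scores into predicted token IDs\n",
"- The softmax function above produced a 50,257-dimensional vector for each token; the `argmax` function returns the position of the highest probability score in this vector, which is the predicted next token ID for the given token"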
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f3b84c9f-dd08-482e-b903-a86fe44e1144",
|
||
"metadata": {},
|
||
"source": [
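"- Since we have 2 input examples with 3 tokens each, we obtain a 2-by-3 set of predicted token IDs (the tensor below keeps a trailing dimension of size 1 because of `keepdim=True`):"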
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"id": "34ebd76a-16ec-4c17-8958-8a135735cc1c",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "34ebd76a-16ec-4c17-8958-8a135735cc1c",
|
||
"outputId": "ed17da47-c3e7-4775-fd00-4ec5bcda3db2"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Token IDs:\n",
|
||
" tensor([[[16657],\n",
|
||
" [ 339],\n",
|
||
" [42826]],\n",
|
||
"\n",
|
||
" [[49906],\n",
|
||
" [29669],\n",
|
||
" [41751]]])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"token_ids = torch.argmax(probas, dim=-1, keepdim=True)\n",
|
||
"print(\"Token IDs:\\n\", token_ids)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "cee4072c-21ed-4df7-8721-dd2535362573",
|
||
"metadata": {},
|
||
"source": [
"- If we decode these tokens, we find that they are quite different from the tokens we want the model to predict, namely the target tokens:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"id": "c990ead6-53cd-49a7-a6d1-14d8c1518249",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Targets batch 1: effort moves you\n",
|
||
"Outputs batch 1: Armed heNetflix\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(f\"Targets batch 1: {token_ids_to_text(targets[0], tokenizer)}\")\n",
|
||
"print(f\"Outputs batch 1: {token_ids_to_text(token_ids[0].flatten(), tokenizer)}\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a53eb8a7-070e-46d6-930c-314ba55a6ff2",
|
||
"metadata": {},
|
||
"source": [
"- That is because the model has not been trained yet\n",
"- To train the model, we need to know how far it is from the correct predictions (targets)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ad90592f-0d5d-4ec8-9ff5-e7675beab10e",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/proba-index.webp\" width=500px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c7251bf5-a079-4782-901d-68c9225d3157",
|
||
"metadata": {},
|
||
"source": [
"- The token probabilities corresponding to the target indices are as follows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"id": "54aef09c-d6e3-4238-8653-b3a1b0a1077a",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "54aef09c-d6e3-4238-8653-b3a1b0a1077a",
|
||
"outputId": "41c946a2-c458-433e-a53d-5e7e89d9dddc"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Batch 1: tensor([7.4540e-05, 3.1061e-05, 1.1563e-05])\n",
|
||
"Batch 2: tensor([3.9836e-05, 1.6783e-05, 4.7559e-06])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"batch_idx = 0\n",
|
||
"target_probas_1 = probas[batch_idx, [0, 1, 2], targets[batch_idx]]\n",
|
||
"print(\"Batch 1:\", target_probas_1)\n",
|
||
"\n",
|
||
"batch_idx = 1\n",
"target_probas_2 = probas[batch_idx, [0, 1, 2], targets[batch_idx]]\n",
|
||
"print(\"Batch 2:\", target_probas_2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a0e89a19-73c2-4e49-93b4-861f699f1cbf",
|
||
"metadata": {},
|
||
"source": [
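"- We want to maximize all these values, bringing them close to a probability of 1\n",
"- In mathematical optimization, it is easier to maximize the logarithm of the probability score than the probability score itself; this is out of the scope of this book, but I have recorded a lecture with more details here: [L8.2 Logistic Regression Loss Function](https://www.youtube.com/watch?v=GxJe0DZvydM)"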
|
||
]
|
||
},
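- One practical reason for working with logarithms can be illustrated with a small sketch. The probabilities below are made-up values (not model outputs): multiplying many small probabilities underflows in floating-point arithmetic, while summing their logarithms stays finite. Because the logarithm is monotonic, maximizing the log probability also maximizes the probability itself.

```python
import math

# Hypothetical per-token probabilities: 100 tokens, each assigned 1e-5 by a model
probs = [1e-5] * 100

# Multiplying the probabilities directly underflows to 0.0 in float64
product = 1.0
for p in probs:
    product *= p

# Summing the logarithms carries the same information and stays finite
log_sum = sum(math.log(p) for p in probs)
avg_log_prob = log_sum / len(probs)

print(product)       # the joint probability is no longer representable
print(log_sum)       # the sum of log probabilities is still a usable number
print(avg_log_prob)  # average log probability per token
```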
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"id": "31402a67-a16e-4aeb-977e-70abb9c9949b",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "31402a67-a16e-4aeb-977e-70abb9c9949b",
|
||
"outputId": "1bf18e79-1246-4eab-efd8-12b328c78678"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor([ -9.5042, -10.3796, -11.3677, -10.1308, -10.9951, -12.2561])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Compute the logarithm of all token probabilities\n",
|
||
"log_probas = torch.log(torch.cat((target_probas_1, target_probas_2)))\n",
|
||
"print(log_probas)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c4261441-a511-4633-9c4c-67998af31b84",
|
||
"metadata": {},
|
||
"source": [
"- Next, we compute the average log probability:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"id": "9b003797-161b-4d98-81dc-e68320e09fec",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "9b003797-161b-4d98-81dc-e68320e09fec",
|
||
"outputId": "a447fe9c-7e27-40ed-f1fb-51210e3f7cc9"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor(-10.7722)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Calculate the average log probability over all tokens\n",
|
||
"avg_log_probas = torch.mean(log_probas)\n",
|
||
"print(avg_log_probas)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "36d51994-ad17-4ba3-a6ec-f588b4b13585",
|
||
"metadata": {},
|
||
"source": [
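"- The goal is to make this average log probability as large as possible by optimizing the model weights\n",
"- Because log(p) is at most 0 for a probability p (reached at p = 1), the largest possible value is 0, and we are currently far away from 0"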
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "3de388a1-8a0a-4c94-8894-9041dc6ad514",
|
||
"metadata": {},
|
||
"source": [
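"- In deep learning, instead of maximizing the average log probability, it is standard convention to minimize the *negative* average log probability; in our case, instead of maximizing -10.7722 so that it approaches 0, we minimize 10.7722 so that it approaches 0\n",
"- The negative of -10.7722, that is, 10.7722, is also called the cross-entropy loss in deep learning"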
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"id": "176ddf35-1c5f-4d7c-bf17-70f3e7069bd4",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor(10.7722)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"neg_avg_log_probas = avg_log_probas * -1\n",
|
||
"print(neg_avg_log_probas)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "84eeb868-abd8-4028-82db-107546bf7c2c",
|
||
"metadata": {},
|
||
"source": [
"- PyTorch already implements a `cross_entropy` function that carries out the previous steps"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5bd24b7f-b760-47ad-bc84-86d13794aa54",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/cross-entropy.webp\" width=400px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e8aaf9dd-3ee6-42bf-a63f-6e93dbfb989d",
|
||
"metadata": {},
|
||
"source": [
"- Before we apply the cross-entropy function, let's check the shapes of the logits and targets"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"id": "695d6f64-5084-4c23-aea4-105c9e38cfe4",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "695d6f64-5084-4c23-aea4-105c9e38cfe4",
|
||
"outputId": "43fd802a-8136-4b35-df0d-f61a5d4cb561"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Logits shape: torch.Size([2, 3, 50257])\n",
|
||
"Targets shape: torch.Size([2, 3])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Logits have shape (batch_size, num_tokens, vocab_size)\n",
|
||
"print(\"Logits shape:\", logits.shape)\n",
|
||
"\n",
"# Targets have shape (batch_size, num_tokens)\n",
|
||
"print(\"Targets shape:\", targets.shape)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "1d3d65f0-6566-4865-93e4-0c0bcb10cd06",
|
||
"metadata": {},
|
||
"source": [
"- For the `cross_entropy` function in PyTorch, we want to flatten these tensors by combining them over the batch dimension:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"id": "0e17e027-ab9f-4fb5-ac9b-a009b831c122",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "0e17e027-ab9f-4fb5-ac9b-a009b831c122",
|
||
"outputId": "0b2b778b-02fb-43b2-c879-adc59055a7d8"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Flattened logits: torch.Size([6, 50257])\n",
|
||
"Flattened targets: torch.Size([6])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"logits_flat = logits.flatten(0, 1)\n",
|
||
"targets_flat = targets.flatten()\n",
|
||
"\n",
|
||
"print(\"Flattened logits:\", logits_flat.shape)\n",
|
||
"print(\"Flattened targets:\", targets_flat.shape)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4921a57f-3a79-473e-a863-6d63b495010f",
|
||
"metadata": {},
|
||
"source": [
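"- Note that the targets are the token IDs, which also represent the index positions in the logits tensors that we want to maximize\n",
"- The `cross_entropy` function in PyTorch automatically applies the softmax and log-probability computation internally to the logits at the token indices to be maximized"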
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"id": "62d0816e-b29a-4c8f-a9a5-a167562de978",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "62d0816e-b29a-4c8f-a9a5-a167562de978",
|
||
"outputId": "c0be634a-2c65-4ff7-a73f-1bfc2e406ba4"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor(10.7722)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"loss = torch.nn.functional.cross_entropy(logits_flat, targets_flat)\n",
|
||
"print(loss)"
|
||
]
|
||
},
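- To make the steps hidden inside `cross_entropy` concrete, the following is a minimal pure-Python sketch of the same computation with mean reduction: a log-softmax over each row of logits, followed by the negative average of the log probabilities at the target indices. The helper name `cross_entropy_mean` and the toy numbers are assumptions for illustration only; this is not PyTorch's actual implementation.

```python
import math

def cross_entropy_mean(logits, targets):
    """Mean negative log probability of the target index in each row of logits."""
    losses = []
    for row, target in zip(logits, targets):
        # log-sum-exp with the row maximum subtracted for numerical stability
        m = max(row)
        log_norm = m + math.log(sum(math.exp(v - m) for v in row))
        log_prob_target = row[target] - log_norm  # log-softmax at the target index
        losses.append(-log_prob_target)
    return sum(losses) / len(losses)

# Toy example: 2 token positions, vocabulary of 3 tokens (made-up numbers)
toy_logits = [[2.0, 0.5, -1.0],
              [0.0, 0.0, 0.0]]
toy_targets = [0, 2]

loss = cross_entropy_mean(toy_logits, toy_targets)
print(loss)
```

- In the second row all logits are equal, so the target token receives probability 1/3 and contributes log(3) to the loss, which is a handy sanity check for any cross-entropy implementation.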
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0f15ce17-fd7b-4d8e-99da-b237523a7a80",
|
||
"metadata": {},
|
||
"source": [
"- A concept related to the cross-entropy loss is the perplexity of an LLM\n",
"- The perplexity is simply the exponential of the cross-entropy loss"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"id": "168952a1-b964-4aa7-8e49-966fa26add54",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "168952a1-b964-4aa7-8e49-966fa26add54",
|
||
"outputId": "a0a692c1-6412-4068-8aa5-8858548141eb"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor(47678.8633)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"perplexity = torch.exp(loss)\n",
|
||
"print(perplexity)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "71ae26dd-d77e-41fd-b924-6bd103dd4ee7",
|
||
"metadata": {},
|
||
"source": [
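"- The perplexity is often considered more interpretable because it can be understood as the effective vocabulary size that the model is uncertain about at each step (in the example above, that would be 47,678 words or tokens)\n",
"- In other words, perplexity provides a measure of how well the probability distribution predicted by the model matches the actual distribution of the words in the dataset\n",
"- Similar to the loss, a lower perplexity indicates that the model predictions are closer to the actual distribution"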
|
||
]
|
||
},
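- The "effective vocabulary size" reading can be checked with a short sketch: a hypothetical model that spreads its probability uniformly over all 50,257 tokens has a cross-entropy of log(50257), so its perplexity equals the vocabulary size. The untrained model's loss of 10.7722 reported above sits only slightly below that uniform baseline.

```python
import math

vocab_size = 50257  # GPT-2 vocabulary size used in this chapter

# A uniform model assigns 1 / vocab_size to every target token,
# so its cross-entropy loss is log(vocab_size)
uniform_loss = -math.log(1.0 / vocab_size)
uniform_perplexity = math.exp(uniform_loss)

# Loss reported above for the untrained model
untrained_loss = 10.7722
untrained_perplexity = math.exp(untrained_loss)

print(uniform_loss)
print(uniform_perplexity)
print(untrained_perplexity)
```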
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "2ec6c217-e429-40c7-ad71-5d0a9da8e487",
|
||
"metadata": {
|
||
"id": "2ec6c217-e429-40c7-ad71-5d0a9da8e487"
|
||
},
|
||
"source": [
"### 5.1.3 Calculating the training and validation set losses"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "530da89e-2448-436c-8f1b-28e8a31ef85c",
|
||
"metadata": {},
|
||
"source": [
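"- We use a relatively small dataset for training the LLM (in fact, only one short story)\n",
"  - The reasons are:\n",
"    - You can run the code examples in a few minutes on a laptop without a suitable GPU\n",
"    - The training finishes relatively fast (minutes instead of weeks), which is good for educational purposes\n",
"    - We use a text from the public domain, which can be included in this GitHub repository without violating any usage rights or bloating the repository size\n",
"\n",
"- For example, Llama 2 7B required 184,320 GPU hours on A100 GPUs to be trained on 2 trillion tokens\n",
"  - At the time of this writing, the hourly cost of an 8xA100 cloud server at AWS is approximately 30 USD\n",
"  - So, via a back-of-the-envelope calculation, training this LLM would cost 184,320 / 8 * 30 USD = 691,200 USD (roughly 690,000 USD)\n",
"\n",
"- Below, we use the same dataset we used in chapter 2"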
|
||
]
|
||
},
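- The back-of-the-envelope estimate above can be reproduced in a few lines. The variable names are made up for this sketch; the numbers are the ones quoted in the text.

```python
gpu_hours = 184_320        # A100 GPU hours quoted above for Llama 2 7B
gpus_per_server = 8        # one 8xA100 cloud server
usd_per_server_hour = 30   # approximate hourly price quoted above

# Spreading the GPU hours over 8 GPUs gives the number of server hours
server_hours = gpu_hours / gpus_per_server
total_cost_usd = server_hours * usd_per_server_hour

print(f"Server hours: {server_hours:,.0f}")      # Server hours: 23,040
print(f"Estimated cost: {total_cost_usd:,.0f} USD")  # Estimated cost: 691,200 USD
```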
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"id": "654fde37-b2a9-4a20-a8d3-0206c056e2ff",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import os\n",
|
||
"import urllib.request\n",
|
||
"\n",
|
||
"file_path = \"the-verdict.txt\"\n",
"url = \"https://raw.githubusercontent.com/rasbt/LLMs-from-scratch/main/ch02/01_main-chapter-code/the-verdict.txt\"\n",
|
||
"\n",
|
||
"if not os.path.exists(file_path):\n",
"    with urllib.request.urlopen(url) as response:\n",
"        text_data = response.read().decode('utf-8')\n",
"    with open(file_path, \"w\", encoding=\"utf-8\") as file:\n",
"        file.write(text_data)\n",
"else:\n",
"    with open(file_path, \"r\", encoding=\"utf-8\") as file:\n",
"        text_data = file.read()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "379330f1-80f4-4e34-8724-41d892b04cee",
|
||
"metadata": {},
|
||
"source": [
"- A quick check that the text loaded correctly by printing the first and last 99 characters"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"id": "6kgJbe4ehI4q",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 35
|
||
},
|
||
"id": "6kgJbe4ehI4q",
|
||
"outputId": "9ff31e88-ee37-47e9-ee64-da6eb552f46f"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"I HAD always thought Jack Gisburn rather a cheap genius--though a good fellow enough--so it was no \n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# First 99 characters\n",
|
||
"print(text_data[:99])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"id": "j2XPde_ThM_e",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 35
|
||
},
|
||
"id": "j2XPde_ThM_e",
|
||
"outputId": "a900c1b9-9a87-4078-968b-a5721deda5cb"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"it for me! The Strouds stand alone, and happen once--but there's no exterminating our kind of art.\"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"# Last 99 characters\n",
|
||
"print(text_data[-99:])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"id": "6b46a952-d50a-4837-af09-4095698f7fd1",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "6b46a952-d50a-4837-af09-4095698f7fd1",
|
||
"outputId": "c2a25334-21ca-486e-8226-0296e5fc6486"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Characters: 20479\n",
|
||
"Tokens: 5145\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"total_char = len(text_data)\n",
|
||
"total_tokens = len(tokenizer.encode(text_data))\n",
|
||
"\n",
|
||
"print(\"Characters:\", total_char)\n",
|
||
"print(\"Tokens:\", total_tokens)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a8830cb9-90f6-4e7c-8620-beeabc2d39f7",
|
||
"metadata": {},
|
||
"source": [
"- With 5,145 tokens, the text is very short for training an LLM, but again, it is for educational purposes (we will also load pretrained weights later)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "bedcad87-a0e8-4b9d-ac43-4e927ccbb50f",
|
||
"metadata": {},
|
||
"source": [
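"- Next, we split the dataset into a training and a validation set and use the data loaders from chapter 2 to prepare the batches for LLM training\n",
"- For visualization purposes, the figure below assumes `max_length=6`, but for the training loader, we set `max_length` equal to the context length that the LLM supports\n",
"- The figure below shows only the input tokens for simplicity\n",
"  - Since we train the LLM to predict the next word in the text, the targets look the same as these inputs, except that the targets are shifted by one position"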
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "46bdaa07-ba96-4ac1-9d71-b3cc153910d9",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/batching.webp\" width=500px>"
|
||
]
|
||
},
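- The relationship between inputs and targets described above can be sketched without PyTorch. The helper `make_windows` below is a hypothetical stand-in for the chunking idea, not the `create_dataloader_v1` function from chapter 2, and the integer IDs are placeholders for real token IDs. It uses `max_length=6` and a stride equal to `max_length`, so the windows do not overlap.

```python
def make_windows(token_ids, max_length, stride):
    """Split token_ids into (inputs, targets) pairs; targets are inputs shifted by one."""
    pairs = []
    for start in range(0, len(token_ids) - max_length, stride):
        inputs = token_ids[start:start + max_length]
        targets = token_ids[start + 1:start + max_length + 1]
        pairs.append((inputs, targets))
    return pairs

toy_ids = list(range(14))  # placeholder for real token IDs
windows = make_windows(toy_ids, max_length=6, stride=6)

for inputs, targets in windows:
    print(inputs, "->", targets)
```

- Each target row repeats its input row from the second element onward and appends the next token, which is exactly the next-token prediction setup used for training.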
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"id": "0959c855-f860-4358-8b98-bc654f047578",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"from previous_chapters import create_dataloader_v1\n",
|
||
"\n",
"# Train/validation ratio\n",
|
||
"train_ratio = 0.90\n",
|
||
"split_idx = int(train_ratio * len(text_data))\n",
|
||
"train_data = text_data[:split_idx]\n",
|
||
"val_data = text_data[split_idx:]\n",
|
||
"\n",
|
||
"\n",
|
||
"torch.manual_seed(123)\n",
|
||
"\n",
|
||
"train_loader = create_dataloader_v1(\n",
"    train_data,\n",
"    batch_size=2,\n",
"    max_length=GPT_CONFIG_124M[\"ctx_len\"],\n",
"    stride=GPT_CONFIG_124M[\"ctx_len\"],\n",
"    drop_last=True,\n",
"    shuffle=True\n",
|
||
")\n",
|
||
"\n",
|
||
"val_loader = create_dataloader_v1(\n",
"    val_data,\n",
"    batch_size=2,\n",
"    max_length=GPT_CONFIG_124M[\"ctx_len\"],\n",
"    stride=GPT_CONFIG_124M[\"ctx_len\"],\n",
"    drop_last=False,\n",
"    shuffle=False\n",
|
||
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"id": "f37b3eb0-854e-4895-9898-fa7d1e67566e",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"# Sanity check: make sure the training and validation sets each hold at least one full context window of tokens\n",
"\n",
"if total_tokens * (train_ratio) < GPT_CONFIG_124M[\"ctx_len\"]:\n",
"    print(\"Not enough tokens for the training loader. \"\n",
"          \"Try to lower the `GPT_CONFIG_124M['ctx_len']` or \"\n",
"          \"increase the `train_ratio`\")\n",
"\n",
"if total_tokens * (1-train_ratio) < GPT_CONFIG_124M[\"ctx_len\"]:\n",
"    print(\"Not enough tokens for the validation loader. \"\n",
"          \"Try to lower the `GPT_CONFIG_124M['ctx_len']` or \"\n",
"          \"decrease the `train_ratio`\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e7ac3296-a4d1-4303-9ac5-376518960c33",
|
||
"metadata": {},
|
||
"source": [
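"- We use a relatively small batch size to reduce the computational resource demand, and because the dataset is very small to begin with\n",
"- For comparison, Llama 2 7B was trained with a batch size of 1024"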
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a8e0514d-b990-4dc0-9afb-7721993284a0",
|
||
"metadata": {},
|
||
"source": [
"- An optional check that the data was loaded correctly:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"id": "ca0116d0-d229-472c-9fbf-ebc229331c3e",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Train loader:\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n",
|
||
"\n",
|
||
"Validation loader:\n",
|
||
"torch.Size([2, 256]) torch.Size([2, 256])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"Train loader:\")\n",
|
||
"for x, y in train_loader:\n",
"    print(x.shape, y.shape)\n",
|
||
"\n",
|
||
"print(\"\\nValidation loader:\")\n",
|
||
"for x, y in val_loader:\n",
"    print(x.shape, y.shape)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f7b9b1a4-863d-456f-a8dd-c07fb5c024ed",
|
||
"metadata": {},
|
||
"source": [
"- Another optional check that the token counts are in the expected ballpark:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"id": "eb860488-5453-41d7-9870-23b723f742a0",
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/"
|
||
},
|
||
"id": "eb860488-5453-41d7-9870-23b723f742a0",
|
||
"outputId": "96b9451a-9557-4126-d1c8-51610a1995ab"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Training tokens: 4608\n",
|
||
"Validation tokens: 512\n",
|
||
"All tokens: 5120\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"train_tokens = 0\n",
|
||
"for input_batch, target_batch in train_loader:\n",
"    train_tokens += input_batch.numel()  # numel() returns the number of tokens in the batch\n",
|
||
"\n",
|
||
"val_tokens = 0\n",
|
||
"for input_batch, target_batch in val_loader:\n",
"    val_tokens += input_batch.numel()\n",
|
||
"\n",
|
||
"print(\"Training tokens:\", train_tokens)\n",
|
||
"print(\"Validation tokens:\", val_tokens)\n",
|
||
"print(\"All tokens:\", train_tokens + val_tokens)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "5c3085e8-665e-48eb-bb41-cdde61537e06",
|
||
"metadata": {},
|
||
"source": [
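"- Next, we implement a utility function to calculate the cross-entropy loss of a given batch\n",
"- In addition, we implement a second utility function that computes the average loss over a user-specified number of batches in a data loader"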
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"id": "7b9de31e-4096-47b3-976d-b6d2fdce04bc",
|
||
"metadata": {
|
||
"id": "7b9de31e-4096-47b3-976d-b6d2fdce04bc"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"def calc_loss_batch(input_batch, target_batch, model, device):\n",
"    input_batch, target_batch = input_batch.to(device), target_batch.to(device)\n",
"\n",
"    logits = model(input_batch)\n",
"    logits = logits.flatten(0, 1)\n",
"    loss = torch.nn.functional.cross_entropy(logits, target_batch.flatten())\n",
"    return loss\n",
"\n",
"\n",
"def calc_loss_loader(data_loader, model, device, num_batches=None):  # num_batches limits how many batches are evaluated\n",
"    total_loss = 0.\n",
"    if num_batches is None:\n",
"        num_batches = len(data_loader)\n",
"    else:\n",
"        # Cap num_batches at the number of batches available in data_loader\n",
"        num_batches = min(num_batches, len(data_loader))\n",
"    for i, (input_batch, target_batch) in enumerate(data_loader):\n",
"        if i < num_batches:\n",
"            loss = calc_loss_batch(input_batch, target_batch, model, device)\n",
"            total_loss += loss.item()\n",
"        else:\n",
"            break\n",
"    return total_loss / num_batches"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f0691332-84d0-48b3-b462-a885ddeb4fca",
|
||
"metadata": {},
|
||
"source": [
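"- If you have a machine with a CUDA-supported GPU, the LLM will train on the GPU without any changes to the code\n",
"- Via the `device` setting, we ensure that the data is loaded onto the same device as the LLM model"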
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"id": "56f5b0c9-1065-4d67-98b9-010e42fc1e2a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Training loss: 10.98758347829183\n",
|
||
"Validation loss: 10.98110580444336\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"device = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n",
"model.to(device)  # For nn.Module classes, the assignment model = model.to(device) is not required\n",
|
||
"\n",
|
||
"\n",
"torch.manual_seed(123)  # Set the seed explicitly so that the results are reproducible\n",
|
||
"train_loss = calc_loss_loader(train_loader, model, device)\n",
|
||
"val_loss = calc_loss_loader(val_loader, model, device)\n",
|
||
"\n",
|
||
"print(\"Training loss:\", train_loss)\n",
|
||
"print(\"Validation loss:\", val_loss)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "43875e95-190f-4b17-8f9a-35034ba649ec",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-1.webp\" width=400px>"
|
||
]
|
||
},
{
"cell_type": "markdown",
"id": "b9339f8d-00cb-4206-af67-58c32bd72055",
"metadata": {
"id": "b9339f8d-00cb-4206-af67-58c32bd72055"
},
"source": [
"## 5.2 Training an LLM"
]
},
{
"cell_type": "markdown",
"id": "652a4cf4-e98f-46d9-bdec-60e7ccb8d6bd",
"metadata": {},
"source": [
"- In this section, we finally implement the code for training the LLM.\n",
"- We focus on a simple training function (if you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/03_main-chapter-code)).\n",
"\n",
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/train-steps.webp\" width=300px>"
]
},
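{
"cell_type": "markdown",
"id": "added-sketch-lr-schedule-clipping",
"metadata": {},
"source": [
"- The three techniques named above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Appendix D implementation: the helper `lr_at_step`, its default values, the step counts, and the tiny `torch.nn.Linear` model are all hypothetical.\n",
"\n",
"```python\n",
"import math\n",
"import torch\n",
"\n",
"\n",
"def lr_at_step(step, total_steps, warmup_steps, peak_lr, initial_lr=1e-5, min_lr=1e-6):\n",
"    # Hypothetical helper: linear warmup followed by cosine annealing\n",
"    if step < warmup_steps:\n",
"        return initial_lr + (peak_lr - initial_lr) * step / warmup_steps\n",
"    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)\n",
"    return min_lr + (peak_lr - min_lr) * 0.5 * (1 + math.cos(math.pi * progress))\n",
"\n",
"\n",
"model = torch.nn.Linear(4, 2)  # stand-in for the LLM\n",
"optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)\n",
"\n",
"total_steps, warmup_steps = 20, 5\n",
"for step in range(total_steps):\n",
"    # Set the learning rate for this step on every parameter group\n",
"    lr = lr_at_step(step, total_steps, warmup_steps, peak_lr=5e-4)\n",
"    for group in optimizer.param_groups:\n",
"        group[\"lr\"] = lr\n",
"\n",
"    optimizer.zero_grad()\n",
"    loss = model(torch.randn(8, 4)).pow(2).mean()  # dummy loss for illustration\n",
"    loss.backward()\n",
"    # Rescale gradients so that their global norm does not exceed max_norm\n",
"    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)\n",
"    optimizer.step()\n",
"```\n",
"\n",
"- Clipping is applied after `loss.backward()` and before `optimizer.step()`, because the gradients must already exist and must be rescaled before the weights are updated."
]
},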
{
"cell_type": "code",
"execution_count": 30,
"id": "Mtp4gY0ZO-qq",
"metadata": {
"id": "Mtp4gY0ZO-qq"
},
"outputs": [],
"source": [
"def train_model_simple(model, train_loader, val_loader, optimizer, device, num_epochs,\n",
"                       eval_freq, eval_iter, start_context):\n",
"    # Initialize lists to track losses and tokens seen\n",
"    train_losses, val_losses, track_tokens_seen = [], [], []\n",
"    tokens_seen, global_step = 0, -1\n",
"\n",
"    # Main training loop\n",
"    for epoch in range(num_epochs):\n",
"        model.train()  # Set the model to training mode\n",
"\n",
"        for input_batch, target_batch in train_loader:\n",
"            optimizer.zero_grad()  # Reset loss gradients from the previous batch iteration\n",
"            loss = calc_loss_batch(input_batch, target_batch, model, device)\n",
"            loss.backward()  # Calculate loss gradients\n",
"            optimizer.step()  # Update model weights using loss gradients\n",
"            tokens_seen += input_batch.numel()\n",
"            global_step += 1\n",
"\n",
"            # Optional evaluation step\n",
"            if global_step % eval_freq == 0:\n",
"                train_loss, val_loss = evaluate_model(\n",
"                    model, train_loader, val_loader, device, eval_iter)\n",
"                train_losses.append(train_loss)\n",
"                val_losses.append(val_loss)\n",
"                track_tokens_seen.append(tokens_seen)\n",
"                print(f\"Ep {epoch+1} (Step {global_step:06d}): \"\n",
"                      f\"Train loss {train_loss:.3f}, Val loss {val_loss:.3f}\")\n",
"\n",
"        # Print a sample text after each epoch\n",
"        generate_and_print_sample(\n",
"            model, train_loader.dataset.tokenizer, device, start_context\n",
"        )\n",
"\n",
"    return train_losses, val_losses, track_tokens_seen\n",
"\n",
"\n",
"def evaluate_model(model, train_loader, val_loader, device, eval_iter):\n",
"    model.eval()\n",
"    with torch.no_grad():\n",
"        train_loss = calc_loss_loader(train_loader, model, device, num_batches=eval_iter)\n",
"        val_loss = calc_loss_loader(val_loader, model, device, num_batches=eval_iter)\n",
"    model.train()\n",
"    return train_loss, val_loss\n",
"\n",
"\n",
"def generate_and_print_sample(model, tokenizer, device, start_context):\n",
"    model.eval()\n",
"    context_size = model.pos_emb.weight.shape[0]\n",
"    encoded = text_to_token_ids(start_context, tokenizer).to(device)\n",
"    with torch.no_grad():\n",
"        token_ids = generate_text_simple(\n",
"            model=model, idx=encoded,\n",
"            max_new_tokens=50, context_size=context_size\n",
"        )\n",
"    decoded_text = token_ids_to_text(token_ids, tokenizer)\n",
"    print(decoded_text.replace(\"\\n\", \" \"))  # Compact print format\n",
"    model.train()"
]
},
{
"cell_type": "markdown",
"id": "a301b333-b9d4-4eeb-a212-3a9874e3ac47",
"metadata": {},
"source": [
"- Now, let's train the LLM using the training function defined above:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "3422000b-7aa2-485b-92df-99372cd22311",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "3422000b-7aa2-485b-92df-99372cd22311",
"outputId": "0e046603-908d-4093-8ae5-ef2f632639fb"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ep 1 (Step 000000): Train loss 9.657, Val loss 9.845\n",
"Ep 1 (Step 000005): Train loss 7.690, Val loss 8.041\n",
"Every effort moves you,. \n",
"Ep 2 (Step 000010): Train loss 6.532, Val loss 6.812\n",
"Ep 2 (Step 000015): Train loss 5.920, Val loss 6.577\n",
"Every effort moves you, and, and, and, and, and, and, and. \", and,, and, and, and, and, and, and, and,, and, and,, and, and,, and,, and\n",
"Ep 3 (Step 000020): Train loss 5.777, Val loss 6.494\n",
"Ep 3 (Step 000025): Train loss 5.692, Val loss 6.505\n",
"Every effort moves you. \n",
"Ep 4 (Step 000030): Train loss 5.528, Val loss 6.503\n",
"Ep 4 (Step 000035): Train loss 5.365, Val loss 6.457\n",
"Every effort moves you, and, and the \", and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and, and\n",
"Ep 5 (Step 000040): Train loss 4.939, Val loss 6.452\n",
"Every effort moves you, and in the picture. I was his \" the picture. \n",
"Ep 6 (Step 000045): Train loss 4.555, Val loss 6.462\n",
"Ep 6 (Step 000050): Train loss 4.257, Val loss 6.317\n",
"Every effort moves you, and he had been the picture of the \n",
"Ep 7 (Step 000055): Train loss 3.721, Val loss 6.242\n",
"Ep 7 (Step 000060): Train loss 3.275, Val loss 6.176\n",
"Every effort moves you know it was his pictures--I glanced after him, and I felt. \"I he was his pictures--I had been the sketch of the donkey, and I had always, I\n",
"Ep 8 (Step 000065): Train loss 2.825, Val loss 6.177\n",
"Ep 8 (Step 000070): Train loss 2.425, Val loss 6.157\n",
"Every effort moves you know the fact, and I felt--I had been-chairs forward. \"I turned back the head to me--and the honour, the donkey, and I had a little of\n",
"Ep 9 (Step 000075): Train loss 2.110, Val loss 6.218\n",
"Ep 9 (Step 000080): Train loss 1.517, Val loss 6.238\n",
"Every effort moves you know,\" was not that my hostess was not the fact that the last word. Gisburn's an! \"--and it, the donkey, and it, and I had\n",
"Ep 10 (Step 000085): Train loss 1.262, Val loss 6.305\n",
"Every effort moves you?\" \"Yes--quite insensible to the irony. Gisburn's it was no great, in fact, becoming the man of the moment--as Jack himself, as once one had to wander up and down the room, when I\n"
]
}
],
"source": [
"torch.manual_seed(123)\n",
"model = GPTModel(GPT_CONFIG_124M)\n",
"model.to(device)\n",
"optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)\n",
"\n",
"num_epochs = 10\n",
"train_losses, val_losses, tokens_seen = train_model_simple(\n",
"    model, train_loader, val_loader, optimizer, device,\n",
"    num_epochs=num_epochs, eval_freq=5, eval_iter=5,\n",
"    start_context=\"Every effort moves you\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "0WSRu2i0iHJE",
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 487
},
"id": "0WSRu2i0iHJE",
"outputId": "9d36c61b-517d-4f07-a7e8-4563aff78b11"
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAnYAAAHWCAYAAAD6oMSKAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAABwgElEQVR4nO3deVxU9f7H8dcMO8iiKFuI+4b7nvu+pZZaamW2Z+XeXtcy61d5rbTNsux2rVuaZmpZuZv7vqG4Wy6ggruAICBwfn8MDOCWKHJgeD8fj3kw8z1nznzgULz9nvP9fi2GYRiIiIiISJFnNbsAEREREckfCnYiIiIiDkLBTkRERMRBKNiJiIiIOAgFOxEREREHoWAnIiIi4iAU7EREREQchIKdiIiIiINQsBMRERFxEAp2IuLQLBYLv/zyi9lliIgUCAU7ESnULBbLdR+PPvqo2SWKiBQazmYXICJyPTExMfbnM2bMYPTo0ezbt8/e5uHhYUZZIiKFknrsRKRQCwoKsj98fX2xWCy52qZNm0alSpVwdXWlWrVqfP/999c93ttvv01gYCAREREArF27ltatW+Ph4UHZsmUZPnw4iYmJ9v3Lly/Pe++9x+OPP463tzdhYWFMnjzZvj01NZWhQ4cSHByMu7s75cuXZ+zYsdf8/OXLl9OkSRO8vLzw8/OjRYsWHDlyxL79t99+o2HDhri7u1OxYkXeeust0tLS7Nvj4uIYNGgQAQEB+Pj40L59e7Zv327fPmbMGOrVq8f3339P+fLl8fX15f777ychIeGGf+YiUnQp2IlIkTVnzhxGjBjBCy+8wM6dO3n66ad57LHHWLZs2RX7GobBiBEj+Oabb1i9ejX16tUjMjKSLl260KdPH3bs2MGMGTNYvXo1Q4cOzfXe8ePH06hRI7Zt28bgwYN59tln2bt3LwCffvopc+fO5aeffmLfvn388MMPlC9f/qr1pqWl0atXL9q0acOOHTtYt24dgwYNwmKxALBw4UIeeughhg8fzu7du/nqq6/49ttveffdd+3fQ/fu3YmNjWXevHls2bKFBg0a0KFDB86ePWv/nL///ptffvmF33//nd9//50VK1bw73//Oz9+5CJS2BkiIkXElClTDF9fX/vr5s2bG0899VSuffr27Wvcdddd9teAMXPmTOOhhx4yqlevbkRHR9u3DRw40Bg0aFCu969atcqwWq3GxYsXDcMwjHLlyhkPPfSQfXtGRoYREBBgTJo0yTAMwxg2bJjRvn17IyMj4x/rP3PmjAEYy5cvv+r2Vq1aGe+9916utu+//94IDg42DMMwli5davj4+BjJycm59qlUqZLx1VdfGYZhGG+++abh6elpxMfH27e/9NJLRtOmTf+xPhEp+nSPnYgUWXv27GHQoEG52lq0aMEnn3ySq+25557Dzc2N9evXU7p0aXv7li1b+Ouvv5g6daq9zTAMMjIyOHToEDVq1ACgTp069u1Zl4JPnjwJwKOPPkqnTp2oVq0aXbt2pUePHnTu3Pmq9ZYqVYpHH32ULl260KlTJzp27Ei/fv0IDg6217Np0yZ7Dx1Aeno6ycnJJCUlsWXLFi5cuIC/v3+u4168eJG///7b/rp8+fJ4e3vbXwcHB9vrFRHHpmAnIkVa1mXMLIZhXNHWqVMnfvzxRxYuXMiAAQPs7RkZGTz99NMMHz78iuOGhYXZn7u4uFzxmRkZGQA0aNCAQ4cOMX/+fJYsWUK/fv3o2LEjP//881XrnTJlCsOHD2fBggXMmDGD119/ncWLF3PnnXeSkZHBW2+9RZ8+fa54n7u7OxkZGQQHB7N8+fIrtvv5+d1QvSLi2BTsRKTIqlGjBqtXr+bhhx+2t61du9be05bl7rvvpmfPnjz44IM4OTlx//33A7ZQtmvXLipXrnxLdfj4+NC/f3/69+/PfffdR9euXTl79iylSpW66v7169enfv36vPbaazRr1oxp06Zx55130qBBA/
bt23fNeho0aEBsbCzOzs7XvI9PRIo3BTsRKbJeeukl+vXrZx9A8NtvvzF79myWLFlyxb69e/fm+++/Z+DAgTg7O3PffffxyiuvcOeddzJkyBCeeuopvLy82LNnD4sXL+azzz67oRo++ugjgoODqVevHlarlZkzZxIUFJSrBy3LoUOHmDx5MnfffTchISHs27eP/fv324Pp6NGj6dGjB2XLlqVv375YrVZ27NhBZGQk77zzDh07dqRZs2b06tWLcePGUa1aNY4fP868efPo1asXjRo1uqWfp4gUfQp2IlJk9erVi08++YQPPviA4cOHU6FCBaZMmULbtm2vuv99991HRkYGAwcOxGq10qdPH1asWMGoUaNo1aoVhmFQqVIl+vfvf8M1lChRgnHjxnHgwAGcnJxo3Lgx8+bNw2q9ctIBT09P9u7dy3fffceZM2cIDg5m6NChPP300wB06dKF33//nbfffpv3338fFxcXqlevzpNPPgnYLqnOmzePUaNG8fjjj3Pq1CmCgoJo3bo1gYGBef8BiojDsRiGYZhdhIiIiIjcOs1jJyIiIuIgFOxEREREHISCnYiIiIiDULATERERcRAKdiIiIiIOQsFORERExEEo2OXBF198QYUKFXB3d6dhw4asWrXK7JKKrZUrV9KzZ09CQkKwWCz88ssvubYbhsGYMWMICQnBw8ODtm3bsmvXrlz7pKSkMGzYMEqXLo2Xlxd33303R48ezbXPuXPnGDhwIL6+vvj6+jJw4EDOnz+fa5+oqCh69uyJl5cXpUuXZvjw4aSmpt6Ob9uhjR07lsaNG+Pt7U1AQAC9evVi3759ufbReS1aJk2aRJ06dfDx8cHHx4dmzZoxf/58+3adz6Jv7NixWCwWRo4caW/TeTWZITdk+vTphouLi/H1118bu3fvNkaMGGF4eXkZR44cMbu0YmnevHnGqFGjjFmzZhmAMWfOnFzb//3vfxve3t7GrFmzjMjISKN///5GcHCwER8fb9/nmWeeMe644w5j8eLFxtatW4127doZdevWNdLS0uz7dO3a1ahVq5axdu1aY+3atUatWrWMHj162LenpaUZtWrVMtq1a2ds3brVWLx4sRESEmIMHTr0tv8MHE2XLl2MKVOmGDt37jQiIiKM7t27G2FhYcaFCxfs++i8Fi1z5841/vjjD2Pfvn3Gvn37jH/961+Gi4uLsXPnTsMwdD6Luo0bNxrly5c36tSpY4wYMcLervNqLgW7G9SkSRPjmWeeydVWvXp149VXXzWpIslyebDLyMgwgoKCjH//+9/2tuTkZMPX19f48ssvDcMwjPPnzxsuLi7G9OnT7fscO3bMsFqtxoIFCwzDMIzdu3cbgLF+/Xr7PuvWrTMAY+/evYZh2AKm1Wo1jh07Zt/nxx9/NNzc3Iy4uLjb8v0WFydPnjQAY8WKFYZh6Lw6ipIlSxr/+c9/dD6LuISEBKNKlSrG4sWLjTZt2tiDnc6r+XQp9gakpqayZcsWOnfunKu9c+fOrF271qSq5FoOHTpEbGxsrvPl5uZGmzZt7Odry5YtXLp0Kdc+ISEh1KpVy77PunXr8PX1pWnTpvZ97rzzTnx9fXPtU6tWLUJCQuz7dOnShZSUFLZs2XJbv09HFxcXB0CpUqUAndeiLj09nenTp5OYmEizZs10Pou4IUOG0L17dzp27JirXefVfFor9gacPn2a9PT0K9ZiDAwMJDY21qSq5FqyzsnVzteRI0fs+7i6ulKyZMkr9sl6f2xsLAEBAVccPyAgINc+l39OyZIlcXV11e/GLTAMg+eff56WLVtSq1YtQOe1qIqMjKRZs2YkJydTokQJ5syZQ3h4uP2Ps85n0TN9+nS2bNnC5s2br9im/07Np2CXBxaLJddrwzCuaJPC42bO1+X7XG3/m9lH8mbo0KHs2LGD1atXX7FN57VoqVatGhEREZw/f55Zs2bxyCOPsGLFCvt2nc+iJTo6mhEjRrBo0SLc3d2vuZ
/Oq3l0KfYGlC5dGicnpyv+BXDy5Mkr/rUg5gsKCgK47vkKCgoiNTWVc+fOXXefEydOXHH8U6dO5drn8s85d+4cly5d0u/GTRo2bBhz585l2bJlhIaG2tt1XosmV1dXKleuTKNGjRg7dix169blk08+0fksorZs2cLJkydp2LAhzs7OODs7s2LFCj799FOcnZ3tP0+dV/Mo2N0AV1dXGjZsyOLFi3O1L168mObNm5tUlVxLhQoVCAoKynW+UlNTWbFihf18NWzYEBcXl1z7xMTEsHPnTvs+zZo1Iy4ujo0bN9r32bBhA3Fxcbn22blzJzExMfZ9Fi1ahJubGw0bNryt36ejMQyDoUOHMnv2bP78808qVKiQa7vOq2MwDIOUlBSdzyKqQ4cOREZGEhERYX80atSIAQMGEBERQcWKFXVezVawYzWKrqzpTr755htj9+7dxsiRIw0vLy/j8OHDZpdWLCUkJBjbtm0ztm3bZgDGhAkTjG3bttmnn/n3v/9t+Pr6GrNnzzYiIyONBx544KrD7UNDQ40lS5YYW7duNdq3b3/V4fZ16tQx1q1bZ6xbt86oXbv2VYfbd+jQwdi6dauxZMkSIzQ0tNgPt78Zzz77rOHr62ssX77ciImJsT+SkpLs++i8Fi2vvfaasXLlSuPQoUPGjh07jH/961+G1Wo1Fi1aZBiGzqejyDkq1jB0Xs2mYJcHn3/+uVGuXDnD1dXVaNCggX0aBil4y5YtM4ArHo888ohhGLYh92+++aYRFBRkuLm5Ga1btzYiIyNzHePixYvG0KFDjVKlShkeHh5Gjx49jKioqFz7nDlzxhgwYIDh7e1teHt7GwMGDDDOnTuXa58jR44Y3bt3Nzw8PIxSpUoZQ4cONZKTk2/nt++QrnY+AWPKlCn2fXRei5bHH3/c/v/MMmXKGB06dLCHOsPQ+XQUlwc7nVdzWQzDMMzpKxQRERGR/KR77EREREQchIKdiIiIiINQsBMRERFxEAp2IiIiIg5CwU5ERETEQSjYiYiIiDgIBbs8SklJYcyYMaSkpJhdiuQTnVPHpPPqeHROHZPOa/7SPHZ5FB8fj6+vL3Fxcfj4+JhdjuQDnVPHpPPqeHROHZPOa/5Sj52IiIiIg1CwExEREXEQzmYXcLulpaWxbds2AgMDsVpvPccmJCQAcOzYMeLj42/5eGI+nVPHpPPqeHROHZPO6z/LyMjgxIkT1K9fH2fn60c3h7/HbtOmTTRp0sTsMkRERERuycaNG2ncuPF193H4HrvAwEDA9sMIDg42uRoRERGRvImJiaFJkyb2THM9Dh/ssi6/BgcHExoaanI1IiIiIjfnRm4p0+AJEREREQehYCciIiLiIBTsRERERByEqffYrVy5kg8++IAtW7YQExPDnDlz6NWrl327YRi89dZbTJ48mXPnztG0aVM+//xzatasaV7RIiIimdLT07l06ZLZZUgR5+LigpOTU74cy9Rgl5iYSN26dXnssce49957r9j+/vvvM2HCBL799luqVq3KO++8Q6dOndi3bx/e3t4mVCwiImLreIiNjeX8+fNmlyIOws/Pj6CgICwWyy0dx9Rg161bN7p163bVbYZh8PHHHzNq1Cj69OkDwHfffUdgYCDTpk3j6aefLshSRURE7LJCXUBAAJ6enrf8x1iKL8MwSEpK4uTJkwC3PDVboZ3u5NChQ8TGxtK5c2d7m5ubG23atGHt2rUKdiIiYor09HR7qPP39ze7HHEAHh4eAJw8eZKAgIBbuixbaINdbGwswBWT8QUGBnLkyJFrvi8lJYWUlBT766ylSkRERPJD1j11np6eJlcijiTr9+nSpUu3FOwK/ajYy7u3DcO4bpf32LFj8fX1tT/Cw8Nvd4kiIlIM6fKr5Kf8+n0qtMEuKCgIyO65y3Ly5MnrLqnx2muvERcXZ3/s3r37ttYpIiIiUlgU2mBXoUIFgoKCWLx4sb0tNTWVFStW0Lx582u+z83NDR8fH/tDo2
dFRERun7Zt2zJy5Mgb3v/w4cNYLBYiIiJuW00Ay5cvx2KxFLuRy6beY3fhwgX++usv++tDhw4RERFBqVKlCAsLY+TIkbz33ntUqVKFKlWq8N577+Hp6cmDDz5oYtUiIiJFzz9d6nvkkUf49ttv83zc2bNn4+LicsP7ly1blpiYGEqXLp3nz5J/Zmqw27x5M+3atbO/fv7554HsX66XX36ZixcvMnjwYPsExYsWLVIvnIiISB7FxMTYn8+YMYPRo0ezb98+e1vWyMwsly5duqHAVqpUqTzV4eTkZL/dSvKfqZdi27Zti2EYVzyy/sVgsVgYM2YMMTExJCcns2LFCmrVqmVmydd28TwcWml2FSIiIlcVFBRkf/j6+mKxWOyvk5OT8fPz46effqJt27a4u7vzww8/cObMGR544AFCQ0Px9PSkdu3a/Pjjj7mOe/ml2PLly/Pee+/x+OOP4+3tTVhYGJMnT7Zvv/xSbNYl06VLl9KoUSM8PT1p3rx5rtAJ8M477xAQEIC3tzdPPvkkr776KvXq1cvTz2DWrFnUrFkTNzc3ypcvz/jx43Nt/+KLL6hSpQru7u4EBgZy33332bf9/PPP1K5dGw8PD/z9/enYsSOJiYl5+vyCUGjvsStSTu2DCTXgxwchRdOriIgUN4ZhkJSaZsrDMIx8+z5eeeUVhg8fzp49e+jSpQvJyck0bNiQ33//nZ07dzJo0CAGDhzIhg0brnuc8ePH06hRI7Zt28bgwYN59tln2bt373XfM2rUKMaPH8/mzZtxdnbm8ccft2+bOnUq7777LuPGjWPLli2EhYUxadKkPH1vW7ZsoV+/ftx///1ERkYyZswY3njjDXtn0ubNmxk+fDhvv/02+/btY8GCBbRu3Rqw9XY+8MADPP744+zZs4fly5fTp0+ffP3Z55dCO49dkeJfBXzugDMHIOJHaDrI7IpERKQAXbyUTvjohaZ89u63u+Dpmj9/zkeOHGlf7SnLiy++aH8+bNgwFixYwMyZM2natOk1j3PXXXcxePBgwBYWP/roI5YvX0716tWv+Z53332XNm3aAPDqq6/SvXt3kpOTcXd357PPPuOJJ57gscceA2D06NEsWrSICxcu3PD3NmHCBDp06MAbb7wBQNWqVdm9ezcffPABjz76KFFRUXh5edGjRw+8vb0pV64c9evXB2zBLi0tjT59+lCuXDkAateufcOfXZDUY5cfrFZomrkSxsavICPD3HpERERuQqNGjXK9Tk9P591336VOnTr4+/tTokQJFi1aRFRU1HWPU6dOHfvzrEu+WUtm3ch7spbVynrPvn37aNKkSa79L3/9T/bs2UOLFi1ytbVo0YIDBw6Qnp5Op06dKFeuHBUrVmTgwIFMnTqVpKQkAOrWrUuHDh2oXbs2ffv25euvv+bcuXN5+vyCoh67/FL3flj6Npz5C/7+E6p0NLsiEREpIB4uTux+u4tpn51fvLy8cr0eP348H330ER9//DG1a9fGy8uLkSNHkpqaet3jXD7owmKxkPEPnR4535M1gjfne662YEFeXG2Bg5zH8Pb2ZuvWrSxfvpxFixYxevRoxowZw6ZNm/Dz82Px4sWsXbuWRYsW8dlnnzFq1Cg2bNhAhQoV8lTH7aYeu/zi5g31B9qeb/jS3FpERKRAWSwWPF2dTXnczhUwVq1axT333MNDDz1E3bp1qVixIgcOHLhtn3ct1apVY+PGjbnaNm/enKdjhIeHs3r16lxta9eupWrVqvYlvJydnenYsSPvv/8+O3bs4PDhw/z555+A7Ry3aNGCt956i23btuHq6sqcOXNu4bu6PdRjl08MwyCt4RO4rP8C/loMp/+C0pXNLktEROSmVa5cmVmzZrF27VpKlizJhAkTiI2NpUaNGgVax7Bhw3jqqado1KgRzZs3Z8aMGezYsYOKFSve8DFeeOEFGjduzP/93//Rv39/1q1bx8SJE/niiy8A+P333zl48CCtW7emZMmSzJs3j4
yMDKpVq8aGDRtYunQpnTt3JiAggA0bNnDq1KkC/zncCPXY5YO524/T6aOVfLfXAlW72ho3Tr7+m0RERAq5N954gwYNGtClSxfatm1LUFAQvXr1KvA6BgwYwGuvvcaLL75IgwYNOHToEI8++iju7u43fIwGDRrw008/MX36dGrVqsXo0aN5++23efTRRwHw8/Nj9uzZtG/fnho1avDll1/y448/UrNmTXx8fFi5ciV33XUXVatW5fXXX2f8+PF069btNn3HN89iFMaxuvno6NGjlC1blujoaEJDQ2/LZ0zdcIRRc3ZSqYwXS3oZWL7vBa4l4Pk94O5zWz5TRETMkZyczKFDh6hQoUKegoXkr06dOhEUFMT3339vdin54nq/V3nJMuqxywd31w3Bw8WJv08lstlaB0pXg9QLEDHN7NJERESKvKSkJCZMmMCuXbvYu3cvb775JkuWLOGRRx4xu7RCR8EuH3i7u9Czrm1o9o+bojX1iYiISD6yWCzMmzePVq1a0bBhQ3777TdmzZpFx46ageJyCnb55P4mYQDMi4whruq94OYLZw/CX0tMrkxERKRo8/DwYMmSJZw9e5bExES2bt16xUTKYqNgl0/ql/WjWqA3yZcy+HX3eWiQOfXJjumm1iUiIiLFh4JdPrFYLNzfpCwAP26MxmgyCO79BnppTjsREREpGAp2+ah3/TtwdbayJyaeHRd8ofZ94OxqdlkiIiJSTCjY5SM/T1fuqhUEwPRN0dkbMtIhLcWkqkRERKS4ULDLZ/0b2wZRzI04RmJKGmyfDp/Ug03/MbcwERERcXgKdvnszoqlqFDai8TUdH7fcRwuXYS4KIj82ezSRERExMEp2OUzi8VC/8bZgyio0x/u+Rwem2dyZSIiIreubdu2jBw50v66fPnyfPzxx9d9j8Vi4Zdffrnlz86v41zPmDFjqFev3m39jNtJwe42uLdBKM5WCxHR59l7Ng3qPwQuHmaXJSIixVjPnj2vOaHvunXrsFgsbN26Nc/H3bRpE4MGDbrV8nK5VriKiYkplOuzFiYKdrdBGW83OoUHAjB9Y85BFBmQcsGkqkREpDh74okn+PPPPzly5MgV2/773/9Sr149GjRokOfjlilTBk9Pz/wo8R8FBQXh5uZWIJ9VVCnY3SZZK1HM3nqU5EvpcGAJfN4EFr9hcmUiIlIc9ejRg4CAAL799ttc7UlJScyYMYMnnniCM2fO8MADDxAaGoqnpye1a9fmxx9/vO5xL78Ue+DAAVq3bo27uzvh4eEsXrz4ive88sorVK1aFU9PTypWrMgbb7zBpUuXAPj2229566232L59OxaLBYvFYq/58kuxkZGRtG/fHg8PD/z9/Rk0aBAXLmR3oDz66KP06tWLDz/8kODgYPz9/RkyZIj9s25ERkYGb7/9NqGhobi5uVGvXj0WLFhg356amsrQoUMJDg7G3d2d8uXLM3bsWPv2MWPGEBYWhpubGyEhIQwfPvyGP/tmON/WoxdjrSqX5g4/D46dv8j8nTH0LukOZw5A/DHoMBo8SppdooiI5LfUxLy/x8kNnDL/HKenQXoKWKy5b+G51nFdvW74Y5ydnXn44Yf59ttvGT16NBaLBYCZM2eSmprKgAEDSEpKomHDhrzyyiv4+Pjwxx9/MHDgQCpWrEjTpk3/8TMyMjLo06cPpUuXZv369cTHx+e6Hy+Lt7c33377LSEhIURGRvLUU0/h7e3Nyy+/TP/+/dm5cycLFixgyRLbspy+vr5XHCMpKYmuXbty5513smnTJk6ePMmTTz7J0KFDc4XXZcuWERwczLJly/jrr7/o378/9erV46mnnrqhn9snn3zC+PHj+eqrr6hfvz7//e9/ufvuu9m1axdVqlTh008/Ze7cufz000+EhYURHR1NdLTtat3PP//MRx99xPTp06lZsyaxsbFs3779hj73ZinY3SZWq20QxYTF+/lxYzS9B7WAwFpwYids+wGaDzO7RBERyW/vhe
T9PX2/hZq9bc/3/gYzH4VyLeGxP7L3+bg2JJ258r1j4vL0UY8//jgffPABy5cvp127doDtMmyfPn0oWbIkJUuW5MUXX7TvP2zYMBYsWMDMmTNvKNgtWbKEPXv2cPjwYUJDQwF47733rrgv7vXXX7c/L1++PC+88AIzZszg5ZdfxsPDgxIlSuDs7ExQUNA1P2vq1KlcvHiR//3vf3h52QLuxIkT6dmzJ+PGjSMw0HZLVMmSJZk4cSJOTk5Ur16d7t27s3Tp0hsOdh9++CGvvPIK999/PwDjxo1j2bJlfPzxx3z++edERUVRpUoVWrZsicVioVy5cvb3RkVFERQURMeOHXFxcSEsLIwmTZrc0OfeLF2KvY36NgrFaoGNh87y9+lEaPq0bcPGybZJi0VERApQ9erVad68Of/9738B+Pvvv1m1ahWPP/44AOnp6bz77rvUqVMHf39/SpQowaJFi4iKirqh4+/Zs4ewsDB7qANo1qzZFfv9/PPPtGzZkqCgIEqUKMEbb7xxw5+R87Pq1q1rD3UALVq0ICMjg3379tnbatasiZOTk/11cHAwJ0+evKHPiI+P5/jx47Ro0SJXe4sWLdizZw9gu9wbERFBtWrVGD58OIsWLbLv17dvXy5evEjFihV56qmnmDNnDmlpaXn6PvNKPXa3UbCvB+2qBbB070lmbIrmX536wuLRcD4K9s2HGj3MLlFERPLTv47n/T1OOQYDVO9pO4blsn6XkZG3VlcOTzzxBEOHDuXzzz9nypQplCtXjg4dOgAwfvx4PvroIz7++GNq166Nl5cXI0eOJDU19YaObRjGFW1Zl3yzrF+/nvvvv5+33nqLLl264Ovry/Tp0xk/fnyevg/DMK449tU+08XF5YptGRkZefqsyz8n52c3aNCAQ4cOMX/+fJYsWUK/fv3o2LEjP//8M2XLlmXfvn0sXryYJUuWMHjwYD744ANWrFhxRV35RT12t1nWIIpZW46SanGDho/aNmz40ryiRETk9nD1yvvDKUcfi5Ozre3yKbKu9d6b0K9fP5ycnJg2bRrfffcdjz32mD2krFq1invuuYeHHnqIunXrUrFiRQ4cOHDDxw4PDycqKorjx7MD7rp163Lts2bNGsqVK8eoUaNo1KgRVapUuWKkrqurK+np17+yFR4eTkREBImJ2fcfrlmzBqvVStWqVW+45uvx8fEhJCSE1atX52pfu3YtNWrUyLVf//79+frrr5kxYwazZs3i7NmzAHh4eHD33Xfz6aefsnz5ctatW0dkZP4F9csp2N1m7aqVIdDHjTOJqSzefQIaPQEWJzi8Ck7sMrs8EREpZkqUKEH//v3517/+xfHjx3n00Uft2ypXrszixYtZu3Yte/bs4emnnyY2NvaGj92xY0eqVavGww8/zPbt21m1ahWjRo3KtU/lypWJiopi+vTp/P3333z66afMmTMn1z7ly5fn0KFDREREcPr0aVJSrlxvfcCAAbi7u/PII4+wc+dOli1bxrBhwxg4cKD9/rr88NJLLzFu3DhmzJjBvn37ePXVV4mIiGDEiBEA9sERe/fuZf/+/cycOZOgoCD8/Pz49ttv+eabb9i5cycHDx7k+++/x8PDI9d9ePlNwe42c3ay0rehbSWK6ZuiwK9s9iXYDV+ZWJmIiBRXTzzxBOfOnaNjx46EhYXZ29944w0aNGhAly5daNu2LUFBQfTq1euGj2u1WpkzZw4pKSk0adKEJ598knfffTfXPvfccw/PPfccQ4cOpV69eqxdu5Y33sg9Fdi9995L165dadeuHWXKlLnqlCuenp4sXLiQs2fP0rhxY+677z46dOjAxIkT8/bD+AfDhw/nhRde4IUXXqB27dosWLCAuXPnUqVKFcAWlMeNG0ejRo1o3Lgxhw8fZt68eVitVvz8/Pj6669p0aIFderUYenSpfz222/4+/vna405WYyrXRB3IEePHqVs2bJER0fnupmzIEWfTaLV+8sAWPVyO8omRM
CUbuDsAc/vBs9SptQlIiJ5l5yczKFDh6hQoQLu7u5mlyMO4nq/V3nJMuqxKwBlS3nSqkppAGZsioawZhBUG9Iuwtb/mVydiIiIOAoFuwLSv7HtcuzMLdGkZRjQ9Bnbhk3/sU1IKSIiInKLFOwKSKfwQEp5uXIiPoVl+05BrfvA0x/iomHfPLPLExEREQegYFdA3JyduLfBHQBM3xgFLu7QYiS0ehFCG5tbnIiIiDgETVBcgPo3DuPrVYdYtu8ksXHJBLW4vQsBi4iISPGiHrsCVDmgBE3KlyLDgJmbo80uR0REbkFeVy8QuZ78+n1Sj10Bu79JWTYePsuMzdEMaVcZqwU4sBg2fwP3fAFet29uGxERuXWurq5YrVaOHz9OmTJlcHV1vebSViL/xDAMUlNTOXXqFFarFVdX11s6noJdAburdjBj5u7i6LmLrP7rNK2rlIZl70JMBGz9Flq9YHaJIiJyHVarlQoVKhATE5Nr6SyRW+Hp6UlYWBhW661dTFWwK2DuLk70rn8H3607wvRNUbSuWgZajoSjm6FmH7PLExGRG+Dq6kpYWBhpaWn/uKapyD9xcnLC2dk5X3p+FexMcH+TML5bd4TFu09w+kIKpWv2hpq9zS5LRETywGKx4OLigouLi9mliNhp8IQJagT7ULesH5fSDWZtOWp2OSIiIuIgFOxM8kDmShQzNkVjX6738BqYPgCOR5hXmIiIiBRZCnYm6Vk3BC9XJw6eTmTDobO2xs3/hb2/w8bJ5hYnIiIiRZKCnUm83Jy5u14IkLkSBWSvHxs5Ey6cMqkyERERKaoU7Ex0f+MwAObtjOV8UiqENoKQBpCeapv6RERERCQPFOxMVCfUlxrBPqSmZTBn2zGwWLJ77TZ9A+mXzC1QREREihQFOxNZLBbuzxxEMX1j5iCKmr3AKwASYmDPXHMLFBERkSJFwc5kverdgZuzlX0nEtgWfR6c3aDR47aNG74ytTYREREpWhTsTObr6UL32sFAjkEUjR4DqwtEb4BjW02sTkRERIoSBbtC4P4mtkEUv22PISH5EngHZa9EoalPRERE5AYp2BUCjcuXpFIZLy5eSmfu9swFpbMGUeycBRdOmleciIiIFBkKdoWAbRCFrddu+sZoW2NoQ7ijkW3qky3fmleciIiIFBkKdoVEnwZ34OJkIfJYHDuPxdka73zW9nXTN5CWal5xIiIiUiQo2BUS/iXc6FwzCLCtHwtAjbvB5w7bxMUXz5lYnYiIiBQFCnaFyAOZl2N/iTjGxdR0cHaFoZvg/qngHWhydSIiIlLYKdgVIs0r+VO2lAcJyWn8ERlja3T1MrcoERERKTIU7AoRqzXnIIqo3BvPHYYdPxV8USIiIlJkKNgVMn0bhuJktbD5yDkOnEiwNZ49CJ/Ug18GQ0KsqfWJiIhI4aVgV8gE+LjTvnoAANOzBlGUqghlm0KFVpAcb2J1IiIiUpgp2BVCDzQpC8DsrUdJSUu3NT78KwycA2WqmliZiIiIFGYKdoVQm6oBBPm4cy7pEgt3nbA1uribW5SIiIgUegp2hZCT1UK/RqHAVQZRxMfAhslgGCZUJiIiIoWZgl0h1a9xWSwWWPv3GY6cSbQ1pibCxEYw/yU4utncAkVERKTQUbArpEJLetKqShkgxyAKVy8Iv8f2fMOXJlUmIiIihZWCXSH2QGPbIIqZm49yKT3D1thkkO3r7l9sl2VFREREMinYFWIdagRSuoQrpy+ksHTPSVtjSD0IawYZabD5v6bWJyIiIoWLgl0h5ups5d6GmYMoNuUYRNH0advXzf+FtBQTKhMREZHCSMGukMtaYmzF/lMcO3/R1li9B/jcAUmnYedsE6sTERGRwqRQB7u0tDRef/11KlSogIeHBxUrVuTtt98mIyPD7NIKTIXSXtxZsRSGAT9lDaJwcoHGT9qeb/hSU5+IiIgIUMiD3bhx4/jyyy+ZOHEie/bs4f333+eDDz7gs88+M7u0AvVAE1uv3c
zN0aRnZIa4Bo+AszvERED0RvOKExERkUKjUAe7devWcc8999C9e3fKly/PfffdR+fOndm8uXjN4dalZhB+ni4cj0tm5f5TtkYvf6jd1/Z8zSfmFSciIiKFRqEOdi1btmTp0qXs378fgO3bt7N69WruuusukysrWO4uTvSufwdw2SCKZkPA4gT7/oB9C0yqTkRERAqLQh3sXnnlFR544AGqV6+Oi4sL9evXZ+TIkTzwwAPXfE9KSgrx8fH2R0JCQgFWfPtkXY5duuckJxOSbY0BNWzhDuCPFyDlgknViYiISGFQqIPdjBkz+OGHH5g2bRpbt27lu+++48MPP+S777675nvGjh2Lr6+v/REeHl6AFd8+VQO9aRDmR1qGwc9bjmZvaPsq+IXZLs0mnTavQBERETGdxTAK75DKsmXL8uqrrzJkyBB72zvvvMMPP/zA3r17r/qelJQUUlKy53Y7duwY4eHhREdHExoaettrvp1+2hzNyz/voJy/J8teaIvVarFtOHsIfMuCk7O5BYqIiEi+O3r0KGXLlr2hLFOoe+ySkpKwWnOX6OTkdN3pTtzc3PDx8bE/vL29b3eZBaZHnWBKuDlz5EwS6w+eyd5QqoJCnYiIiBTuYNezZ0/effdd/vjjDw4fPsycOXOYMGECvXv3Nrs0U3i6OnN3vRAAfsya0y6nS8nw5zuwflIBVyYiIiKFQaHu5vnss8944403GDx4MCdPniQkJISnn36a0aNHm12aaR5oHMa0DVEs3BnL2cRUSnm5Zm/c+zus/ABcPKFmH/AONK9QERERKXCFOth5e3vz8ccf8/HHH5tdSqFRO9SXmiE+7Doez+ytR3myVcXsjbXuhX3zIfxuKBFgXpEiIiJiikJ9KVau7v7MqU+mb4om19gXiwXu+wbC77E9FxERkWJFwa4IuqdeCB4uTvx18gJbjpy79o6Jp+Hi+QKrS0RERMylYFcE+bi70L1OMAA/brzKIAqAvfNgYmNY/EYBViYiIiJmUrAroh5oUhaAPyKPE3fx0pU7eJSEi2dh6//g8JoCrk5ERETMoGBXRDUIK0mVgBIkX8pgbsSxK3co1wwaPmp7/tsISEu5ch8RERFxKAp2RZTFYrEPovhq5UESU9Ku3KnjGPAKgDMHYNWEgi1QRERECpyCXRF2f+Oy3OHnwdFzFxk7f8+VO3iUhG7jbM9XT4BT+wq2QBERESlQCnZFmJebMx/cVweAH9ZHseav01fuVLM3VOkM6anw20i4znJsIiIiUrQp2BVxzSuXZuCd5QB4+ecdXLj8kqzFAt3H21ajiFoL2743oUoREREpCAp2DuDVbtUpW8qDY+cv8t68q1yS9QuD9q/bni9+AxJOFGyBIiIiUiAU7ByAl5sz799bF4BpG6JYdeDUlTs1eRqC60JyHCx8rYArFBERkYKgYOcgmlXy55Fmtkuyr/y8g4Tky+a2c3KGnp+CxQo7Z8H+RSZUKSIiIreTgp0DeaVbdcJKeXI8Lpl3/7jKJdmQenDnYNvzP16A1KQCrU9ERERuLwU7B+Lpmj1KdvqmaFbsv8ol2bavQUgD6DAaXDwKuEIRERG5nRTsHEzTiv481qI8AK/O2kH85Zdk3UrAU39Cnb62EbMiIiLiMBTsHNDLXapT3t+TmLhk3vl995U75Ax0F89D+lVWrRAREZEiR8HOAXm4OvFB37pYLPDT5qMs23vy6jvu+Q0mNoaNkwu2QBEREbktFOwcVOPypXi8RQUAXp29g7ikS1fulHQGEk9C5E9akUJERMQBKNg5sBc7V6NiaS9OxKfw9tUuydZ/2DYFyuMLwapfBRERkaJOf80dmO2SbB0sFpi19ShL91y24oTVCg0fAWc3cwoUERGRfKVg5+AalivFU60qAvDa7MirX5IFSL8E676wDaYQERGRIknBrhh4vlNVKpbx4mRCCm/9tuvqO8160rbU2JIxBVqbiIiI5B8Fu2LA3cWJD/vWxWqB2duOsXj3iSt3av
KU7euWKXBkXcEWKCIiIvlCwa6YaBBWkqda2y7J/mtOJOcSU3PvUL4l1B9oe/7bCEhLKeAKRURE5FYp2BUjz3WsSuWAEpxKSGHM1S7JdnobvMrA6X2w5pOCL1BERERuiYJdMZLzkuyvEcdZsDM29w6epaDrv23PV34Apw8UfJEiIiJy0xTsipl6Zf14pk0lAF7/JZKzl1+SrXUvVOoA6anw+3NgGCZUKSIiIjdDwa4YGtGxClUDS3D6Qipvzr3skqzFAj0mgLMHHF4FEVPNKVJERETyTMGuGHJztl2SdbJa+G37ceZHxuTeoWR5aPea7fnCUXDhVIHXKCIiInmnYFdM1Qn141n7JdmdnLlw2SjYO4dAUG1IPg8L/1XwBYqIiEieKdgVY8M6VKZ6kDdnElMZ/etll2SdnKHnJ2CxQuRP8NcSc4oUERGRG6ZgV4zlvCT7R2QMv+84nnuHOxpCk6dtz9dOLPgCRUREJE8U7Iq5Wnf4MqRdZQBG/7qL05dfkm0/Ctq/Dg/8aEJ1IiIikhcKdsLQdpWpEezD2cRU3vhlJ0bOKU7cvKH1S+DiYV6BIiIickMU7ARXZysf9q2Ds9XC/J2x/LYj5uo7ZqTD9hm2ryIiIlLoKNgJADVDfBnaPuuS7E5OJiTn3sEw4Id7Yc4g2PQfEyoUERGRf6JgJ3ZD2lUmPNiH80mXeH3OZZdkLRao0RPcfMG1hHlFioiIyDUp2Imdi5OV8f3q4uJkYdHuE8zdftko2YaPwbDNUH+AOQWKiIjIdSnYSS41gn0Y3r4KYBslezI+xyVZqxVKBGS/jo+BjIwCrlBERESuRcFOrvBM20rUusOHuIuX+NecyNyXZLPsXwTfdILx1eDXIbD3D0hNLPhiRURExE7BTq7g4mRlfN96uDhZWLLnJHO2Hcu9Q/ol2DcPkuMg8SRs+wGmPwjjKsDUfrD5vxB//OoHFxERkdvGYly1O8ZxHD16lLJlyxIdHU1oaKjZ5RQpny/7iw8W7sPH3ZnFz7ch0Mc99w5pqXBkDexfYAt656Nybw+uB9XugmpdIaiObQCGiIiI5ElesoyCnVxTWnoGfSatZcfRONpXD+CbRxphuVY4Mww4uQf2z4d98+HoZiDHr5bPHdB8GNz5bIHULiIi4ijykmV0KVauydnJyvi+dXF1svLn3pPM2nrs2jtbLBAYDq1egCeXwIv74e6JUL0HuHhC/DHbJdwsSWdh21RIPH37vxEREZFiQsFOrqtKoDfPdaoKwFu/7SI2Lvkf3pGpRAA0GAj3T4WXD8KDM6HWvdnb9y+EXwfD973yv2gREZFiSsFO/tFTrSpQt6wfCclpvDp7x9VHyV6PiwdU7Qy+d+Roc4fgulC1W3ZbaiJ8ficseA0OrsjdwyciIiL/yNnsAqTws12SrcNdn65m+b5TzNx8lH6Ny97aQWv2tj1yzoP39zI4tcf2WP+FbZWLKh1tAzAqdwCPkrf2mSIiIg5OPXZyQyoHePNC5iXZ//t9N8fPX8yfA1tz/ApWbAv9f4B6D4FnaUiJg52zYNYT8H4l+LYHrPscjm7JnBw5PX9qEBERcRDqsZMb9mSriizYFcu2qPO8OjuS7x5rfO1RsjfDrYRtPdoaPW2h7dgW2zQq+xbYevEOr7I9slicbGFw4Ozsti3f2QZrVOkEHn75V5uIiEgRoGAnN8zJauHDvnW565NVrNx/imkboxjQtNzt+TCrE5RtYnt0HANnD9nmy9u/AE4fgIRYMNLBetmv8MJRkJoAQzdnB7s1n8KOn8AnGHxCwDvE9tUn2DYNi3cwuPtqnj0RESnyFOwkTyqVKcFLXarxzh97eP2XnZxLTGVw28pYrbc5FJWqYJsDL2sevIx0uHASMnIMsEi/BNW62Va98A7Obj+9H05E2h7X4uKVO/iFNoImT2VvTzwDrp7g7J4dAC+eh9QLkJFmqycjPfN5WnabcVmbu5/t2Fl2/wqXLtrqdve1tUVvgqObst9jZN6HaL
EAlqt/9fSHeg9mH3f7dLh4DmrcnT1o5cQuOLL22se5/DOsLrZwXLVL9nFTk8DZzRa8RUSk0FGwkzx7rEUFDp5OZNqGKD5ctJ/tR+MY368uPu4uBVeE1ckWxHJycoF7v75y31YvQPg9trn04mMg4bgt/MXH2NqSz8OlRDjzl+0BtrasYGcY8HFt2z4vHQQvf1v70rdsy6flRflW8Ojv2a9/GwkXz8LgDdnB7q/FsGJc3o5bpnruYLdqApzeB4G1soPd4TUw/6W8HdejJLxyOPv1j/3h0Eq49xuofZ+t7cg6WDUe3H3Azcf2feR87uZje53zuat37vsrRUQkXyjYSZ45WS2817s2de7wZfSvu1i8+wS9Jq7hq4ENqRLobXZ5VypVwfa4ltQkSIjJDHvHbcHPLyzH9gtwKcn2PCMtu93qkvlwznw4XfY85+vMtsvrqNAKUi7Ypn/JElgTavfNfo/FaguXGJmLeRg5Xmd+9b4s5FbpZDuOV5ncP4caPTPfk+ny49i3GZCeauvJzCklwfbVLcd5PnfYFkbzwuoMb5zO7v38812I2Q53PgOV2tvaTuyGLVNs3z8W21dLzp7GrNeXbW/9ki3kA+ydZ7s/s0JbCG1oa0uIhcifM/fPeQxsvaO5elvTs583H5b9fe/5HQ6vtt3jWa1r5nFPwJIxOd6Xlvk8I0fPbfqVbb2/Av9KtmMknbV99SyVt5+niEgmLSkmt2R79Hme+WELMXHJeLk68WHfunSrHfzPbyxq0i9BWjK4lije9+KlpUJKPLh62eYnBDh70HaJNzneti05HpLjbKOac7alZLanp9ouSb96JPu4/7sHDi6H3pOhbn9b2955MP2BvNf4+knb5WKAWU9B5E/Q+V1oPtTWFr0RvumU9+M+tzu793PBa7YpeVo+Z7sHFODM3/BZg7wfd9AKCKlne75qgq0nuNHj0OMjW1tGBpw7BCUrqJdTpJjKS5ZRj53ckrpl/fhtWEuGTdvGuoNneHbqVp5uU5GXOlfD2cmB/gg5uWT3AhVnzq7gXDp3W6mKtseNupRsm4w6p5bP21YmKds4u82/kq33zTAy7zPM/Gp/zWWvM79actz/V76lLeQF1Mhu8/SH2v1yHC/HMaxOtvfn7C3NassKsmDrqXPxgHItch+341vZPbUWJ1sQszhd1pZ13MzPKFk++xgJMbavOXuMzx6EiQ1t/6gIrAlBtW2X2IPq2L4vV88b/9mLiMNTj53ki7T0DN5fuI/JKw8C0LJyaT59oD6lvFxNrkykiEmOt4XMrFHdfy2F6Q/aeowvZ7GCf2Vb2AuqDYGZX70DC7RkEbm98pJlFOwkX/2+4zgv/7yDpNR07vDz4MuHGlI71NfsskSKtvQ028Ce2MzR3bGZj8RTV9+/RCCM2JF97+aFU7aBME66SCNSFOlSrJimR50QqgR488wPWzh0OpF7v1zLu71q0bfRLS5BJlKcOTlDQHXbg77Z7QknMkPeDjix0/b89AHbJN05B+TMfBSObYa+39qm1oHM+x4ToESAbjMQcSAKdpLvqgV588uQFrzwUwRL9pzkpZ93sP3oeUb3qImrswPddydiNu9A26NKx+y2rFHeWQzDNvgiLRl8c/wDK2IaLHgFyJwH0TvI1tN3va857zMUKS7SUiDxtK2H3P71VPbr+g9B+Rb/fJwComAnt4WvhwuTBzZi4rK/+GjJfn5YH8Xu4/F8MaAhQb7u/3wAEbk5rp7Z06eAbRT3yJ22cOeXY6WYuGjbYA4jHZJO2x4ndl7/2CH1YdDy7NdrP7MFx9p9s+eVTE/LHCBSjEePS9EQdxTOHbFNTJ81FdXpv2DpGFtgu3DS9jUl7vrHCamvYCfFg9VqYXiHKtS+w5cR07exNeo8PT5bzRcDGtCkgubpEikwVmvusAfQ5V3o9H+QdAYuxNou616Itc3zd+HElV+zpvvJac0ntl6Lim2zg93aT20TbNt7+wKgRJCtZ7FEkK3Nq0zmo3T21D
QieWVkzreZlmL7mppo+wdKVq9aVjDL2bv2+Pzs+ShXvA9bv4N2o6DNy7a2jEuw57crP8vqDJ6loUSZHL+/mb/DZZsU3Pd8AxTs5LZrVz2A34a15Onvt7A3NoEHv17PqO41eLR5eSz6V72IeaxW2x+qEmVso2mvxTBscxBeupi7re4Dtsu+vjlu5s4KgeeP2B7/xM0Xwu6EAT9lt63/0va1Vh9bMATbNDlWZw0AKczO/G3rBStVEfwyL/ufj4Z98yE9JTOAXcp8nmr7mp6a/Twt1fY6PQUemJE9lc+SMbBzNrQYDo2ftLUd2wpft8t7jRdOZgc7vzDbqHLXHBOx+4bCXR/aAptXQHZ4c/crMvNI6r8QKRDl/L2YPbg5r82O5NeI47z12262R59nbJ86eLhq3VGRQs1isU2/kjUFS1Zb5/+7ct+OY6DpM7Y/oLl6AnN8TcrsRclIs13mSruY+xgrP7DtU6FVdrBbNxH+fMc2ujdnb4lX6RzPy9h6VYrgH+NCIyPdtg520pkcj9OZX8/mbr90EQavy37vgtfgwELo8TE0eszWduZA3pcyhMwe4sxgd/Gc7R8JWSuzADhdZSotJ7fcvws5n5cIsD0vkWMqoNYv2h45uXnnXie8CFKwkwLj6erMx/3rUTfUj3fn7eGXiOPsO3GBrx5qSJi/JlkVcQguHv+8jB9k9gKet10eu3zWrdr32Xr+ci6Vl3gKMGxrK188a1sL+Z8E1YZnVme/nv20refxrvezJ4GO/Bl2zMi9nF7OGnO2Zb32vQPu+Tx7v1+HwPko6PwOBNe1te39A9Z9YXtutdqCiJObbZLvnF+dXLOfu3lnr5ACttVYks7aLvVl9YomnbUt4+fslvl+l8znrjnanLPrTUmw/ZxzTnod+bPtfsqavbPr3bcAfnnWFqLIwyxoly5mD6opVcG2bnXOfwB4B0N4r+wa7XVmfXXJ/Hlctt0lx9+E5sOh3kO5e4bLVIOX/s5+v5OrQnwmBTspUBaLhcdbViA8xIeh07ayJyaenhNX88n99WhbLcDs8kSkoFgstt43j5JXbus27sq2zu9Cqxdy3zOVdCb3/VNZ25JO2wKcp3/uY/y91La9w+jstjN/w4FFeau9dNXcr49usa1JfPF8dltCDBxZTZ54lMod7FZ+CIdXwX3/zQ41h1bCzEeufxyL1RZ2MtJs94y5eMKoHCOlI2fC/gW2Zeqygp2zmy0wZ3H3tf38PEtnfvW3rWFsf+5v6wGz5pgq52rnLaAG9Psubz+Hy/lXuvIeUScX2+fLFRTsxBR3VvTnt2EtGTx1K9uizvPYt5t4vmNVhrSrjNWq++5E5DJOzpkDMW7wH4BpKXApKXdbl/dsl/i8g7LbqnXLDk32e35z/D/o8jaLJfserSyd3rathZxz6bqK7eC+KbbnGenZ946lXf41x31nzpfNGBBU29brlvPyoZMr+Nxx2f1qKeTuacy47PK2xXaPYtbchlW72kJdmWrZu4Q2gsHrbYHNo6TmNizCtPKEmColLZ23f9vN1A1RAHQKD2R8v7r4uOt/KiIiN8QwbL1zWaNDs0aKWp1sQU3zDxZ5eckyhf6C9LFjx3jooYfw9/fH09OTevXqsWXLFrPLknzi5uzEu71r8/69dXB1trJ49wl6TVzDgRMJZpcmIlI0WCy2Hja3ErbLpd5BULKcrSdSoa7YKdTB7ty5c7Ro0QIXFxfmz5/P7t27GT9+PH5+fmaXJvmsX+Oy/PxMM0J83Tl4OpF7Pl/DvMiYf36jiIiI2BXqe+zGjRtH2bJlmTJlir2tfPny5hUkt1WdUD9+G9aSYT9uY+3fZxg8dStPt6nIS52r4exUqP8NIiIiUigU6r+Wc+fOpVGjRvTt25eAgADq16/P119/fd33pKSkEB8fb38kJOiSXlHiX8KN/z3ehKdbVwTgqxUHeWTKRs4mpppcmYiISOFXqIPdwYMHmTRpElWqVGHhwoU888wzDB8+nP/973/XfM
/YsWPx9fW1P8LDwwuwYskPzk5WXrurBp8/2ABPVyfW/HWGnp+tZsfR82aXJiIiUqgV6lGxrq6uNGrUiLVr19rbhg8fzqZNm1i3bt1V35OSkkJKSor99bFjxwgPD9eo2CJq/4kEnv5+C4dOJ+LqbOWdXrXo16is2WWJiIgUGIcZFRscHHxFj1uNGjWIioq65nvc3Nzw8fGxP7y9va+5rxR+VQO9+XVoCzrWCCQ1LYOXf97Bo1M28tnSAyzZfYJj5y9SiP9tIiIiUqAK9eCJFi1asG9f7mVj9u/fT7ly5UyqSMzg4+7C5IEN+XzZX0xYsp/l+06xfN8p+3ZfDxeqB3lTI9iH8GAfagT7UCWwBO4uWoNWRESKl0Id7J577jmaN2/Oe++9R79+/di4cSOTJ09m8uTJZpcmBcxqtTCsQxXaVQ9g3d9n2B0Tz56YeP46eYG4i5fYcOgsGw5lL4fjZLVQsbQXNTKDXo1gb8KDfSjj7YbFopUtRETEMRXqe+wAfv/9d1577TUOHDhAhQoVeP7553nqqadu+P1aecKxpaSl89fJC+yJSWBPZtjbExPPuaRLV93f38vVHvSyQl/lgBK4aDoVEREppPKSZW4q2EVHR2OxWOwH37hxI9OmTSM8PJxBgwbdXNW3iYJd8WMYBifiU9gTE2/v2dsTE8+h04lkXOW33cXJQuUAb3uvXlbgK+XlWvDFi4iIXCYvWeamLsU++OCDDBo0iIEDBxIbG0unTp2oWbMmP/zwA7GxsYwePfqmChfJDxaLhSBfd4J83WlXPXvB8Iup6ew/kbNnz/Y8ISXN3jabY/b9A33c7CEvPLNnL9jXHV8PF13OFRGRQummgt3OnTtp0qQJAD/99BO1atVizZo1LFq0iGeeeUbBTgolD1cn6pb1o25ZP3ubYRgcPXcxV8/enpgEos4mcSI+hRPxuQdqAHi4OBGcGRyDfT3sz0P83Anysb3281T4ExGRgndTwe7SpUu4ubkBsGTJEu6++24AqlevTkyM1veUosNisVC2lCdlS3nSpWaQvT0h+RL7YhMyL+favkadTeJsYioXL6Vz8HQiB08nXvO47i5Wgn09CPJxJ9jXnWA/d4J8PQj2sT0P9vWgpMKfiIjks5sKdjVr1uTLL7+ke/fuLF68mP/7v/8D4Pjx4/j7++drgSJm8HZ3oVH5UjQqXypXe/KldGLjkomJSyY2/iLHzyfbX8fEXSQ2LpkziakkX8rg0OlEDl0n/Lk5W3P1/AX5uhPimxkAfW2BsJSXq8KfiIjcsJsKduPGjaN379588MEHPPLII9StWxewre2adYlWxBG5uzhRvrQX5Ut7XXOf5EvpnIxP4Xhm0MsKfTFxWSHwIqcvpJKSlsHhM0kcPpN0zWO5OlspW9KDuqF+9svINYK9cXPWHH0iInKlm57uJD09nfj4eEqWLGlvO3z4MJ6engQEBFznnQVLo2KlMEpJywx/5y8SG58Z/s5nhr/M16cSUq76XhcnC+HBPtTJDHv1yvpSsXQJrFb17ImIOKLbPir24kXbMk5Zoe7IkSPMmTOHGjVq0KVLl5s5pEix4ubsZL+371pS0zI4EZ/MXycvsP3oebZHn2f70TjOJqay/Wgc24/G8f36IwB4uzlTO9TX1qsX6ke9sn4E+boX1LcjIiKFxE0Fu3vuuYc+ffrwzDPPcP78eZo2bYqLiwunT59mwoQJPPvss/ldp0ix4+pstYe/rGlbskbxRkRnBb3zRB6LIyEljbV/n2Ht32fs7w/wdsvs0bOFvdqhvvh6uJj17YiISAG4qWC3detWPvroIwB+/vlnAgMD2bZtG7NmzWL06NEKdiK3Sc5RvD3rhgCQlp7BgZMX7EEvIjqO/ScSOJmQwuLdJ1i8+4T9/RXLeFEv8xJunVBfagT7aE1dEREHclPBLikpCW9vbwAWLVpEnz59sFqt3HnnnRw5ciRfCxSR63N2stonUr6/SRgASalp7Doez/bo80REn2fH0T
",
|
||
"text/plain": [
|
||
"<Figure size 640x480 with 2 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"import matplotlib.pyplot as plt\n",
"\n",
"def plot_losses(epochs_seen, tokens_seen, train_losses, val_losses):\n",
"    fig, ax1 = plt.subplots()\n",
"\n",
"    # Plot training and validation loss against epochs\n",
"    ax1.plot(epochs_seen, train_losses, label=\"Training loss\")\n",
"    ax1.plot(epochs_seen, val_losses, linestyle=\"-.\", label=\"Validation loss\")\n",
"    ax1.set_xlabel(\"Epochs\")\n",
"    ax1.set_ylabel(\"Loss\")\n",
"    ax1.legend(loc=\"upper right\")\n",
"\n",
"    # Create a second x-axis for tokens seen\n",
"    ax2 = ax1.twiny()  # Second x-axis that shares the same y-axis\n",
"    ax2.plot(tokens_seen, train_losses, alpha=0)  # Invisible plot for aligning ticks\n",
"    ax2.set_xlabel(\"Tokens seen\")\n",
"\n",
"    fig.tight_layout()  # Adjust layout to make room\n",
"    plt.savefig(\"loss-plot.pdf\")\n",
"    plt.show()\n",
"\n",
"epochs_tensor = torch.linspace(0, num_epochs, len(train_losses))\n",
"plot_losses(epochs_tensor, tokens_seen, train_losses, val_losses)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "8bc83ded-5f80-4e1c-bf4d-ccb59999d995",
|
||
"metadata": {},
|
||
"source": [
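"- Looking at the results above, we can see that the model starts out generating incomprehensible strings of words, whereas toward the end it is able to produce more or less grammatically correct sentences.\n",
"- However, based on the training and validation set losses, we can see that the model starts overfitting.\n",
"- If we were to check a few passages it writes toward the end of training, we would find that they appear in the training set almost verbatim: the model simply memorizes the training data.\n",
"- Later, we will cover decoding strategies that can mitigate this memorization to a certain degree.\n",
"- Note that the overfitting here occurs because we have a very, very small training set and iterate over it many times.\n",
"  - The LLM training here primarily serves educational purposes; we mainly want to see that the model can learn to produce coherent text.\n",
"  - Instead of spending weeks or months training this model on expensive hardware, we will load pretrained weights later on."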
|
||
]
|
||
},
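The overfitting described above can also be read off the recorded losses instead of the plot. The following is a minimal sketch, assuming the losses are available as two plain Python lists of equal length; the helper names `generalization_gaps` and `best_validation_step` and the toy loss values are hypothetical and only illustrate the idea.

```python
def generalization_gaps(train_losses, val_losses):
    """Return (validation loss - training loss) at each evaluation step."""
    if len(train_losses) != len(val_losses):
        raise ValueError("loss lists must have the same length")
    return [v - t for t, v in zip(train_losses, val_losses)]


def best_validation_step(val_losses):
    """Index of the evaluation step with the lowest validation loss."""
    return min(range(len(val_losses)), key=val_losses.__getitem__)


# Toy numbers (not taken from the run above): the training loss keeps
# falling while the validation loss flattens out and then creeps up.
toy_train = [9.8, 7.1, 5.2, 3.4, 1.9, 0.9, 0.5]
toy_val = [9.9, 7.4, 6.6, 6.3, 6.2, 6.3, 6.4]

gaps = generalization_gaps(toy_train, toy_val)
best = best_validation_step(toy_val)
print("gaps:", [round(g, 2) for g in gaps])
print("lowest validation loss at step:", best)
```

A gap that keeps widening after the validation loss has bottomed out is the usual sign that further epochs mostly memorize the training text.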
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "eb380c42-b31c-4ee1-b8b9-244094537272",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-2.webp\" width=350px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "de713235-1561-467f-bf63-bf11ade383f0",
|
||
"metadata": {},
|
||
"source": [
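"**If you are interested in augmenting this training function with more advanced techniques, such as learning rate warmup, cosine annealing, and gradient clipping, please refer to [Appendix D](../../appendix-D/03_main-chapter-code).**"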
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6d5cdf2f-09a5-4eb0-a20a-d7aac5c14c2c",
|
||
"metadata": {},
|
||
"source": [
"**If you are interested in a larger training dataset and a longer training run, see [../03_bonus_pretraining_on_gutenberg](../03_bonus_pretraining_on_gutenberg)**"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "699f45fc-bf78-42f2-bd24-2355db41b28f",
|
||
"metadata": {
|
||
"id": "699f45fc-bf78-42f2-bd24-2355db41b28f"
|
||
},
|
||
"source": [
"## 5.3 Decoding strategies to control randomness"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6be9086e-2c27-41da-97d0-49137d0ba3c7",
|
||
"metadata": {},
|
||
"source": [
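"- Inference is relatively cheap with a relatively small LLM such as the GPT model we trained above, so there is no need to use a GPU for it even if you used a GPU for training.\n",
"- Using the `generate_text_simple` function (from the previous chapter) that we used earlier inside the simple training function, we can generate new text one word (or token) at a time.\n",
"- As explained in section 5.1.2, the next generated token is the token with the largest probability score among all tokens in the vocabulary."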
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"id": "2734cee0-f6f9-42d5-b71c-fa7e0ef28b6d",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Output text:\n",
|
||
" Every effort moves you?\"\n",
|
||
"\n",
|
||
"\"Yes--quite insensible to the irony. Gisburn's it was no great, in fact,\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"model.to(\"cpu\")\n",
"model.eval()\n",
"\n",
"tokenizer = tiktoken.get_encoding(\"gpt2\")\n",
"\n",
"token_ids = generate_text_simple(\n",
"    model=model,\n",
"    idx=text_to_token_ids(\"Every effort moves you\", tokenizer),\n",
"    max_new_tokens=25,\n",
"    context_size=GPT_CONFIG_124M[\"ctx_len\"]\n",
")\n",
"\n",
"print(\"Output text:\\n\", token_ids_to_text(token_ids, tokenizer))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d25dbe31-bb7c-4893-b25b-47d0492d4aa4",
|
||
"metadata": {},
|
||
"source": [
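"- Even if we execute the `generate_text_simple` function above multiple times, the LLM will always generate the same output.\n",
"- We now introduce two concepts, so-called decoding strategies, to modify `generate_text_simple`: *temperature scaling* and *top-k* sampling.\n",
"- These let us control the randomness and diversity of the text the model generates."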
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4bb6f380-a798-4fd9-825c-17b7cd29a994",
|
||
"metadata": {},
|
||
"source": [
"### 5.3.1 Temperature scaling"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "a7f4f53c-0612-43d3-aa82-52447eac50fa",
|
||
"metadata": {},
|
||
"source": [
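"- Previously, we always picked the token with the highest probability as the next token using `torch.argmax`.\n",
"- To add variety, we can instead sample the next token from the probability distribution using `torch.multinomial(probs, num_samples=1)`.\n",
"- Here, each index's chance of being picked corresponds to its probability in the input tensor."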
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "e7531bae-d5de-44c0-bc78-78fed077e22a",
|
||
"metadata": {},
|
||
"source": [
"- Here is a short recap of generating the next token, assuming a very small vocabulary for illustration purposes:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"id": "01a5ce39-3dc8-4c35-96bc-6410a1e42412",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"forward\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"vocab = {\n",
"    \"closer\": 0,\n",
"    \"every\": 1,\n",
"    \"effort\": 2,\n",
"    \"forward\": 3,\n",
"    \"inches\": 4,\n",
"    \"moves\": 5,\n",
"    \"pizza\": 6,\n",
"    \"toward\": 7,\n",
"    \"you\": 8,\n",
"}\n",
"\n",
"inverse_vocab = {v: k for k, v in vocab.items()}\n",
"\n",
"# Suppose the input is \"every effort moves you\" and the model returns the following logits for the next token:\n",
"next_token_logits = torch.tensor(\n",
"    [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]\n",
")\n",
"\n",
"probas = torch.softmax(next_token_logits, dim=0)\n",
"next_token_id = torch.argmax(probas).item()\n",
"\n",
"# The next generated token:\n",
"print(inverse_vocab[next_token_id])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"id": "6400572f-b3c8-49e2-95bc-433e55c5b3a1",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"toward\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"torch.manual_seed(123)\n",
|
||
"next_token_id = torch.multinomial(probas, num_samples=1).item()\n",
|
||
"print(inverse_vocab[next_token_id])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"id": "b23b863e-252a-403c-b5b1-62bc0a42319f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"71 x closer\n",
|
||
"2 x every\n",
|
||
"0 x effort\n",
|
||
"544 x forward\n",
|
||
"2 x inches\n",
|
||
"1 x moves\n",
|
||
"0 x pizza\n",
|
||
"376 x toward\n",
|
||
"4 x you\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"def print_sampled_tokens(probas):\n",
"    torch.manual_seed(123)  # Manual seed for reproducibility\n",
"    # Draw 1,000 samples from probas with torch.multinomial\n",
"    sample = [torch.multinomial(probas, num_samples=1).item() for i in range(1_000)]\n",
"    # Count how often each token id was sampled with torch.bincount\n",
"    sampled_ids = torch.bincount(torch.tensor(sample))\n",
"    for i, freq in enumerate(sampled_ids):\n",
"        print(f\"{freq} x {inverse_vocab[i]}\")\n",
"\n",
"print_sampled_tokens(probas)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c63d0a27-830b-42b5-9986-6d1a7de04dd9",
|
||
"metadata": {},
|
||
"source": [
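"- Instead of determining the most likely token via `torch.argmax`, we use `torch.multinomial(probas, num_samples=1)` to determine the next token by sampling from the softmax distribution.\n",
"- For illustration purposes, the counts above show what happens when we sample the next token 1,000 times using the original softmax probabilities."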
|
||
]
|
||
},
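The sampled counts can be compared with what the softmax probabilities predict on average. The block below is a dependency-free sketch that recomputes the softmax with the standard library instead of PyTorch; the `softmax` helper and the `expected_counts` dictionary are illustrative names, while the vocabulary and logits are copied from the toy example above.

```python
import math

# Toy vocabulary and logits copied from the example above.
vocab = ["closer", "every", "effort", "forward", "inches",
         "moves", "pizza", "toward", "you"]
next_token_logits = [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probas = softmax(next_token_logits)
num_samples = 1_000

# The expected number of draws per token is probability * sample count.
expected_counts = {tok: p * num_samples for tok, p in zip(vocab, probas)}
for tok, count in expected_counts.items():
    print(f"{count:7.1f} x {tok}")
```

For these logits the expected counts are roughly 572 for `forward`, 358 for `toward`, and 61 for `closer`. The sampled counts of 544, 376, and 71 in the output above differ from these only by sampling noise.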
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "32e7d9cf-a26d-4d9a-8664-4af1efa73832",
|
||
"metadata": {},
|
||
"source": [
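"- We can control the distribution and the selection process via a concept called temperature scaling.\n",
"- \"Temperature scaling\" is just a fancy term for dividing the logits by a number greater than 0.\n",
"- Temperatures greater than 1 result in more uniformly distributed token probabilities after applying the softmax.\n",
"- Temperatures smaller than 1 result in more confident (sharper or more peaked) distributions after applying the softmax."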
|
||
]
|
||
},
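Written out, temperature scaling changes only the argument of the softmax. The symbols used here are notation introduced for this note rather than taken from the text: $z_i$ is the logit of token $i$, $T > 0$ is the temperature, and $p_i(T)$ is the resulting probability.

```latex
p_i(T) = \frac{\exp(z_i / T)}{\sum_{j} \exp(z_j / T)},
\qquad
\frac{p_i(T)}{p_k(T)} = \exp\!\left(\frac{z_i - z_k}{T}\right)
```

The ratio on the right explains both cases: for $T > 1$ the exponent shrinks, so all probability ratios move toward 1 and the distribution flattens; for $T < 1$ the logit gaps are magnified and the largest logit dominates. Setting $T = 1$ recovers the plain softmax.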
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"id": "0759e4c8-5362-467c-bec6-b0a19d1ba43d",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"def softmax_with_temperature(logits, temperature):\n",
"    scaled_logits = logits / temperature\n",
"    return torch.softmax(scaled_logits, dim=0)\n",
"\n",
"# Temperature values\n",
"temperatures = [1, 0.1, 5]  # Original, higher confidence, and lower confidence\n",
"\n",
"# Calculate scaled probabilities\n",
"scaled_probas = [softmax_with_temperature(next_token_logits, T) for T in temperatures]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"id": "2e66e613-4aca-4296-a984-ddd0d80c6578",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "",
|
||
"text/plain": [
|
||
"<Figure size 640x480 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
"# Plotting\n",
"x = torch.arange(len(vocab))\n",
"bar_width = 0.15\n",
"\n",
"fig, ax = plt.subplots()\n",
"for i, T in enumerate(temperatures):\n",
"    # ax.bar() takes the x positions, heights, and width of the bars, plus a legend label\n",
"    rects = ax.bar(x + i * bar_width, scaled_probas[i], bar_width, label=f'Temperature = {T}')\n",
"\n",
"ax.set_ylabel('Probability')\n",
"ax.set_xticks(x)\n",
"ax.set_xticklabels(vocab.keys(), rotation=90)\n",
"ax.legend()\n",
"\n",
"plt.tight_layout()\n",
"# plt.savefig(\"temperature-plot.pdf\")\n",
"plt.show()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "d750e989-842a-4cfa-a44b-cf44d6e49163",
|
||
"metadata": {},
|
||
"source": [
"- We can see that rescaling with a temperature of 0.1 yields a sharper distribution that approaches `torch.argmax`, so that the most likely word is almost always selected:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"id": "e4600713-c51e-4f53-bf58-040a6eb362b8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"0 x closer\n",
|
||
"0 x every\n",
|
||
"0 x effort\n",
|
||
"992 x forward\n",
|
||
"0 x inches\n",
|
||
"0 x moves\n",
|
||
"0 x pizza\n",
|
||
"8 x toward\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print_sampled_tokens(scaled_probas[1])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "526e93cb-8e2a-42a1-b1ba-4fd5fe64c26b",
|
||
"metadata": {},
|
||
"source": [
"- The probabilities rescaled with `temperature=5` are more uniformly distributed:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"id": "9dfb48f0-bc3f-46a5-9844-33b6c9b0f4df",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"153 x closer\n",
|
||
"68 x every\n",
|
||
"55 x effort\n",
|
||
"223 x forward\n",
|
||
"102 x inches\n",
|
||
"50 x moves\n",
|
||
"43 x pizza\n",
|
||
"218 x toward\n",
|
||
"88 x you\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print_sampled_tokens(scaled_probas[2])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0c83f0c4-3774-4375-ad7f-96440ba5fef7",
|
||
"metadata": {},
|
||
"source": [
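"- Assuming the LLM input is \"every effort moves you\", the approach above can sometimes produce nonsensical text such as \"every effort moves you pizza\"; with a temperature of 5 this happens 4.3% of the time (43 out of 1,000 samples in the output above)."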
|
||
]
|
||
},
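How often the nonsensical continuation appears can be checked against its exact probability under each temperature. This is a small standard-library sketch, not the notebook's PyTorch code; the `softmax_with_temperature` function is re-implemented on plain floats and the `pizza_probs` dictionary is an illustrative name.

```python
import math

# Toy vocabulary and logits copied from the example above.
vocab = ["closer", "every", "effort", "forward", "inches",
         "moves", "pizza", "toward", "you"]
next_token_logits = [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed on plain Python floats."""
    scaled = [z / temperature for z in logits]
    largest = max(scaled)
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


pizza_id = vocab.index("pizza")
pizza_probs = {}
for T in (1, 0.1, 5):
    probs = softmax_with_temperature(next_token_logits, T)
    pizza_probs[T] = probs[pizza_id]
    print(f"T={T}: P(pizza) = {pizza_probs[T]:.4%}")
```

At a temperature of 5 the probability of `pizza` is about 4.3%, which matches the 43 occurrences in 1,000 samples shown above. At a temperature of 1 it is roughly 0.01%, and at 0.1 it is practically zero.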
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "c6e4873e-07e4-4abb-85df-bdaedcc1a6f7",
|
||
"metadata": {},
|
||
"source": [
"### 5.3.2 Top-k sampling"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6d4da95a-8bb2-4f69-a9b0-a643531db5df",
|
||
"metadata": {},
|
||
"source": [
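"- To be able to use higher temperatures to increase output diversity while reducing the probability of nonsensical sentences, we can restrict the sampled tokens to the top-k most likely tokens:"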
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "7ae6fffd-2730-4abe-a2d3-781fc4836f17",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/topk.webp\" width=500px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0ba12da5-6ff1-4008-91b8-d2d537cbc14c",
|
||
"metadata": {},
|
||
"source": [
"- In code, we can implement this as follows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"id": "2a7f908a-e9ec-446a-b407-fb6dbf05c806",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Top logits: tensor([6.7500, 6.2800, 4.5100])\n",
|
||
"Top positions: tensor([3, 7, 0])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"top_k = 3\n",
|
||
"top_logits, top_pos = torch.topk(next_token_logits, top_k)\n",
|
||
"\n",
|
||
"print(\"Top logits:\", top_logits)\n",
|
||
"print(\"Top positions:\", top_pos)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"id": "753865ed-79c5-48b1-b9f2-ccb132ff1d2f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor([4.5100, -inf, -inf, 6.7500, -inf, -inf, -inf, 6.2800, -inf])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"new_logits = torch.where(\n",
"    condition=next_token_logits < top_logits[-1],\n",
"    input=torch.tensor(float('-inf')),\n",
"    other=next_token_logits\n",
")\n",
"\n",
"print(new_logits)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"id": "4844f000-c329-4e7e-aa89-16a2c4ebee43",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"tensor([0.0615, 0.0000, 0.0000, 0.5775, 0.0000, 0.0000, 0.0000, 0.3610, 0.0000])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"topk_probas = torch.softmax(new_logits, dim=0)\n",
|
||
"print(topk_probas)"
|
||
]
|
||
},
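The same three steps (find the k-th largest logit, mask everything below it with negative infinity, renormalize with softmax) can be traced without PyTorch. The sketch below uses only the standard library; `top_k_filter` and `softmax` are illustrative helper names, and the logits are the toy values from above.

```python
import math

# Toy logits copied from the example above.
next_token_logits = [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]


def top_k_filter(logits, k):
    """Replace every logit below the k-th largest one with -inf."""
    threshold = sorted(logits, reverse=True)[k - 1]
    return [z if z >= threshold else float("-inf") for z in logits]


def softmax(logits):
    """Softmax over a list of floats; masked (-inf) entries get probability 0."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]  # math.exp(-inf) is 0.0
    total = sum(exps)
    return [e / total for e in exps]


filtered = top_k_filter(next_token_logits, k=3)
topk_probas = softmax(filtered)
print([round(p, 4) for p in topk_probas])
```

Only positions 0, 3, and 7 keep a non-zero probability, and the values agree with the tensor printed above. Logits that tie with the threshold are kept, which mirrors the strict `<` comparison in the `torch.where` call.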
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "56056503-a15d-4315-a3ff-46647a4c7c45",
|
||
"metadata": {},
|
||
"source": [
"### 5.3.3 Modifying the text generation function"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "34770423-473d-46f6-a5fa-6b2979564d26",
|
||
"metadata": {},
|
||
"source": [
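"- The previous two subsections introduced temperature scaling and top-k sampling.\n",
"- Let's use these two concepts to modify the `generate_text_simple` function we used earlier to generate text with the LLM, creating a new `generate` function:"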
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"id": "8e318891-bcc0-4d71-b147-33ce55febfa3",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"def generate(model, idx, max_new_tokens, context_size, temperature, top_k=None):\n",
"\n",
"    # The loop is the same as before: get the logits and focus only on the last time step\n",
"    for _ in range(max_new_tokens):\n",
"        idx_cond = idx[:, -context_size:]\n",
"        with torch.no_grad():\n",
"            logits = model(idx_cond)\n",
"        logits = logits[:, -1, :]\n",
"\n",
"        # Filter the logits with top-k sampling\n",
"        if top_k is not None:\n",
"            # Keep only the top_k values\n",
"            top_logits, _ = torch.topk(logits, top_k)\n",
"            min_val = top_logits[:, -1:]  # (batch_size, 1), so the comparison broadcasts per row\n",
"            logits = torch.where(logits < min_val, torch.tensor(float('-inf')).to(logits.device), logits)\n",
"\n",
"        # Apply temperature scaling\n",
"        if temperature > 0.0:\n",
"            logits = logits / temperature\n",
"\n",
"            # Apply softmax to get probabilities\n",
"            probs = torch.softmax(logits, dim=-1)  # (batch_size, vocab_size)\n",
"\n",
"            # Sample from the distribution\n",
"            idx_next = torch.multinomial(probs, num_samples=1)  # (batch_size, 1)\n",
"\n",
"        # Otherwise, as in generate_text_simple, pick the token with the highest logit via argmax\n",
"        else:\n",
"            idx_next = torch.argmax(logits, dim=-1, keepdim=True)  # (batch_size, 1)\n",
"\n",
"        # Same as before: append the sampled index to the running sequence\n",
"        idx = torch.cat((idx, idx_next), dim=1)  # (batch_size, num_tokens+1)\n",
"\n",
"    return idx"
|
||
]
|
||
},
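The decision logic inside `generate` can be isolated for a single step and a single sequence. The following standard-library sketch is an illustration under simplifying assumptions (a plain list of logits, no batching, no model call); `sample_next_token` and its `rng` parameter are hypothetical names and not part of the notebook.

```python
import math
import random


def sample_next_token(logits, temperature, top_k=None, rng=random):
    """Pick one token id from a list of logits (one sequence, no batching)."""
    if top_k is not None:
        # Mask everything below the k-th largest logit
        threshold = sorted(logits, reverse=True)[top_k - 1]
        logits = [z if z >= threshold else float("-inf") for z in logits]

    if temperature > 0.0:
        scaled = [z / temperature for z in logits]
        largest = max(scaled)
        weights = [math.exp(s - largest) for s in scaled]
        # random.choices normalizes the weights, so no explicit division is needed
        return rng.choices(range(len(logits)), weights=weights, k=1)[0]

    # A temperature of 0 falls back to greedy decoding
    return max(range(len(logits)), key=logits.__getitem__)


# Toy logits copied from section 5.3.1
next_token_logits = [4.51, 0.89, -1.90, 6.75, 1.63, -1.62, -1.89, 6.28, 1.79]

rng = random.Random(123)
greedy_id = sample_next_token(next_token_logits, temperature=0.0)
sampled_ids = {
    sample_next_token(next_token_logits, 1.5, top_k=3, rng=rng)
    for _ in range(200)
}
print("greedy:", greedy_id)
print("ids seen with top_k=3:", sorted(sampled_ids))
```

With `top_k=3`, every sampled id comes from the three highest logits (ids 0, 3, and 7), whatever the temperature. A temperature of 0 always returns id 3, the argmax.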
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"id": "aa2a0d7d-0457-42d1-ab9d-bd67683e7ed8",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Output text:\n",
|
||
" Every effort moves you know began to my surprise to the end it was such a laugh that there: \"sweet of an\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
"torch.manual_seed(123)\n",
"\n",
"token_ids = generate(\n",
"    model=model,\n",
"    idx=text_to_token_ids(\"Every effort moves you\", tokenizer),\n",
"    max_new_tokens=20,\n",
"    context_size=GPT_CONFIG_124M[\"ctx_len\"],\n",
"    top_k=10,\n",
"    temperature=1.5\n",
")\n",
"\n",
"print(\"Output text:\\n\", token_ids_to_text(token_ids, tokenizer))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4e2002ca-f4c1-48af-9e0a-88bfc163ba0b",
|
||
"metadata": {},
|
||
"source": [
"## 5.4 Loading and saving model weights in PyTorch"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "0fc52676-f026-4566-a226-2a90269f9d53",
|
||
"metadata": {},
|
||
"source": [
"- Training LLMs is computationally expensive, so it is crucial to be able to save and load LLM weights.\n",
|
||
"\n",
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/mental-model-3.webp\" width=400px>"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "10e4c7f9-592f-43d6-a00e-598fa01dfb82",
|
||
"metadata": {},
|
||
"source": [
"- The recommended way in PyTorch is to save the model weights, the so-called `state_dict`, by applying the `torch.save` function to the output of the `.state_dict()` method:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"id": "3d67d869-ac04-4382-bcfb-c96d1ca80d47",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"torch.save(model.state_dict(), \"model.pth\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "90e889e0-07bf-43e5-8f92-5c5c7aeaad9e",
|
||
"metadata": {},
|
||
"source": [
"- We can then load the model weights into a new `GPTModel` model instance as follows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 47,
|
||
"id": "9d57d914-60a3-47f1-b499-5352f4c457cb",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"model = GPTModel(GPT_CONFIG_124M)\n",
|
||
"model.load_state_dict(torch.load(\"model.pth\"))\n",
|
||
"model.eval();"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "caa81aec-9c72-4f46-8ae2-4a4fde3edbc1",
|
||
"metadata": {},
|
||
"source": [
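"- It is common to train LLMs with adaptive optimizers such as Adam or AdamW instead of regular SGD.\n",
"- These adaptive optimizers store additional parameters for each model weight, so it makes sense to save them as well if we plan to continue the pretraining later:"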
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 48,
|
||
"id": "bbd175bb-edf4-450e-a6de-d3e8913c6532",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
"torch.save({\n",
"    \"model_state_dict\": model.state_dict(),\n",
"    \"optimizer_state_dict\": optimizer.state_dict(),\n",
"    },\n",
"    \"model_and_optimizer.pth\"\n",
")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 49,
|
||
"id": "8a0c7295-c822-43bf-9286-c45abc542868",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"checkpoint = torch.load(\"model_and_optimizer.pth\")\n",
|
||
"\n",
|
||
"model = GPTModel(GPT_CONFIG_124M)\n",
|
||
"model.load_state_dict(checkpoint[\"model_state_dict\"])\n",
|
||
"\n",
|
||
"optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=0.1)\n",
|
||
"optimizer.load_state_dict(checkpoint[\"optimizer_state_dict\"])\n",
|
||
"model.train();"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4194350e-0409-4a63-8ffd-d3a896509032",
|
||
"metadata": {},
|
||
"source": [
"## 5.5 Loading pretrained weights from OpenAI"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "83eb6c38-7278-40e0-bd9f-8a2b1feac3ec",
|
||
"metadata": {},
|
||
"source": [
|
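||
"- Previously, we trained a small GPT-2 model on a very small book of short stories, for educational purposes only.\n",
|
||
"- Interested readers can find a longer pretraining run on the complete Project Gutenberg book corpus in [../03_bonus_pretraining_on_gutenberg](../03_bonus_pretraining_on_gutenberg).\n",
|
||
"- Fortunately, we don't have to spend tens to hundreds of thousands of dollars to pretrain the model on a large pretraining corpus; we can load the pretrained weights provided by OpenAI instead."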
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "127ddbdb-3878-4669-9a39-d231fbdfb834",
|
||
"metadata": {},
|
||
"source": [
|
||
"- For details on loading the weights from the Hugging Face Hub, see [../02_alternative_weight_loading](../02_alternative_weight_loading)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "75cab892-a165-4f43-9601-f517bc212ab6",
|
||
"metadata": {},
|
||
"source": [
|
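||
"- First, some boilerplate code to download the files from OpenAI and load the weights into Python.\n",
|
||
"- Since OpenAI used [TensorFlow](https://www.tensorflow.org/), we have to install and use TensorFlow to load the weights; [tqdm](https://github.com/tqdm/tqdm) is a progress bar library.\n",
|
||
"- Uncomment and run the next cell to install the required libraries."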
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 50,
|
||
"id": "fb9fdf02-972a-444e-bf65-8ffcaaf30ce8",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# pip install tensorflow tqdm"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 53,
|
||
"id": "a0747edc-559c-44ef-a93f-079d60227e3f",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"TensorFlow version: 2.15.0\n",
|
||
"tqdm version: 4.64.1\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"TensorFlow version:\", version(\"tensorflow\"))\n",
|
||
"print(\"tqdm version:\", version(\"tqdm\"))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 54,
|
||
"id": "c5bc89eb-4d39-4287-9b0c-e459ebe7f5ed",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"WARNING:tensorflow:From e:\\tools\\Anaconda3\\lib\\site-packages\\keras\\src\\losses.py:2976: The name tf.losses.sparse_softmax_cross_entropy is deprecated. Please use tf.compat.v1.losses.sparse_softmax_cross_entropy instead.\n",
|
||
"\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# Import the download helper from the gpt_download.py module\n",
|
||
"from gpt_download import download_and_load_gpt2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ff76a736-6f9f-4328-872e-f89a7b70a2cc",
|
||
"metadata": {},
|
||
"source": [
|
||
"- We can then download the weights of the 124-million-parameter model as follows:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 58,
|
||
"id": "76271dd7-108d-4f5b-9c01-6ae0aac4b395",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"File already exists and is up-to-date: gpt2\\124M\\checkpoint\n",
|
||
"File already exists and is up-to-date: gpt2\\124M\\encoder.json\n"
|
||
]
|
||
},
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"models\\124M\\hparams.json: 100%|██████████| 90.0/90.0 [00:00<00:00, 43.1kiB/s]\n",
|
||
"models\\124M\\model.ckpt.data-00000-of-00001: 100%|██████████| 498M/498M [02:17<00:00, 3.62MiB/s] \n",
|
||
"models\\124M\\model.ckpt.index: 100%|██████████| 5.21k/5.21k [00:00<00:00, 1.45MiB/s]\n",
|
||
"models\\124M\\model.ckpt.meta: 100%|██████████| 471k/471k [00:01<00:00, 369kiB/s] \n",
|
||
"models\\124M\\vocab.bpe: 100%|██████████| 456k/456k [00:00<00:00, 494kiB/s] \n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"hparams, params = download_and_load_gpt2(model_size=\"124M\", models_dir=\"gpt2\")"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "437b20f0",
|
||
"metadata": {},
|
||
"source": [
|
||
"Note: if you run into the error `requests.exceptions.SSLError: Max retries exceeded with url`, \n",
|
||
"try changing the installed urllib3 version:\n",
|
||
"`pip install urllib3==1.25.11`"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 59,
|
||
"id": "b1a31951-d971-4a6e-9c43-11ee1168ec6a",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Settings: {'n_vocab': 50257, 'n_ctx': 1024, 'n_embd': 768, 'n_head': 12, 'n_layer': 12}\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"Settings:\", hparams)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 60,
|
||
"id": "857c8331-130e-46ba-921d-fa35d7a73cfe",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Parameter dictionary keys: dict_keys(['blocks', 'b', 'g', 'wpe', 'wte'])\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"Parameter dictionary keys:\", params.keys())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "466e100c-294e-4afc-a70a-2f398ac4c104",
|
||
"metadata": {},
|
||
"source": [
|
||
"- In addition, \"355M\", \"774M\", and \"1558M\" are also supported `model_size` arguments.\n",
|
||
"- The differences between these model sizes are summarized in the figure below:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "20f19d32-5aae-4176-9f86-f391672c8f0d",
|
||
"metadata": {},
|
||
"source": [
|
||
"<img src=\"https://sebastianraschka.com/images/LLMs-from-scratch-images/ch05_compressed/gpt-sizes.webp?timestamp=123\" width=500px>"
|
||
]
|
||
},
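{
"cell_type": "markdown",
"id": "d7b4e2a1-9c35-4f06-gpt2-param-count-sketch",
"metadata": {},
"source": [
"- As a rough cross-check of these model sizes, the sketch below estimates the parameter count from `emb_dim` and `n_layers` alone. It is not part of the original chapter code, and the helper name `gpt2_param_count` is hypothetical; it assumes a feed-forward hidden size of `4 * emb_dim`, biases in all linear layers (including query, key, and value), and an output head that shares its weights with the token embedding.\n",
"\n",
"```python\n",
"def gpt2_param_count(emb_dim, n_layers, vocab_size=50257, ctx_len=1024):\n",
"    # Token and positional embeddings (the tied output head adds no parameters)\n",
"    embeddings = vocab_size * emb_dim + ctx_len * emb_dim\n",
"    # Query, key, value, and output projections: emb_dim x emb_dim plus a bias each\n",
"    attention = 4 * (emb_dim * emb_dim + emb_dim)\n",
"    # Two linear layers, emb_dim -> 4*emb_dim -> emb_dim, with biases\n",
"    feed_forward = 2 * (4 * emb_dim * emb_dim) + 4 * emb_dim + emb_dim\n",
"    # Two layer norms per block, each with a scale and a shift vector\n",
"    layer_norms = 2 * (2 * emb_dim)\n",
"    final_norm = 2 * emb_dim\n",
"    return embeddings + n_layers * (attention + feed_forward + layer_norms) + final_norm\n",
"\n",
"\n",
"# (emb_dim, n_layers) pairs for the four GPT-2 sizes\n",
"sizes = {\n",
"    'gpt2-small': (768, 12),\n",
"    'gpt2-medium': (1024, 24),\n",
"    'gpt2-large': (1280, 36),\n",
"    'gpt2-xl': (1600, 48),\n",
"}\n",
"for name, (emb_dim, n_layers) in sizes.items():\n",
"    print(f'{name}: {gpt2_param_count(emb_dim, n_layers):,} parameters')\n",
"```\n",
"\n",
"- For `gpt2-small` this evaluates to 124,439,808, which matches the \"124M\" label; the number of attention heads does not affect the count because the heads only partition the existing `emb_dim` dimensions."
]
},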
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "ea6e5076-f08d-41fc-bd8b-1cfe53538f41",
|
||
"metadata": {},
|
||
"source": [
|
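||
"- Above, we loaded the 124M GPT-2 model weights into Python, but we still need to transfer them into our `GPTModel` instance.\n",
|
||
"- First, we initialize a new `GPTModel` instance.\n",
|
||
"- Note that the original GPT model used bias vectors in the linear layers for the query, key, and value matrices of the multi-head attention module, which is unnecessary and not recommended; however, to be able to load the weights correctly, we have to enable them in our implementation too by setting `qkv_bias` to `True`.\n",
|
||
"- We also use the `1024`-token context window length of the original GPT-2 models."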
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 61,
|
||
"id": "9fef90dd-0654-4667-844f-08e28339ef7d",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Define the model-specific settings in a dictionary\n",
|
||
"model_configs = {\n",
|
||
"    \"gpt2-small\": {\"emb_dim\": 768, \"n_layers\": 12, \"n_heads\": 12},\n",
|
||
"    \"gpt2-medium\": {\"emb_dim\": 1024, \"n_layers\": 24, \"n_heads\": 16},\n",
|
||
"    \"gpt2-large\": {\"emb_dim\": 1280, \"n_layers\": 36, \"n_heads\": 20},\n",
|
||
"    \"gpt2-xl\": {\"emb_dim\": 1600, \"n_layers\": 48, \"n_heads\": 25},\n",
|
||
"}\n",
|
||
"\n",
|
||
"# Copy the base configuration and update it with the model-specific settings\n",
|
||
"model_name = \"gpt2-small\" # Example model name\n",
|
||
"NEW_CONFIG = GPT_CONFIG_124M.copy()\n",
|
||
"NEW_CONFIG.update(model_configs[model_name])\n",
|
||
"NEW_CONFIG.update({\"ctx_len\": 1024, \"qkv_bias\": True})\n",
|
||
"\n",
|
||
"gpt = GPTModel(NEW_CONFIG)\n",
|
||
"gpt.eval();"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "272f29ac-8342-4b3d-a57d-9b0166ced314",
|
||
"metadata": {},
|
||
"source": [
|
||
"- Next, we need to assign the OpenAI weights to the corresponding weight tensors in our `GPTModel` instance."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 62,
|
||
"id": "f9a92229-c002-49a6-8cfb-248297ad8296",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"def assign(left, right):\n",
|
||
"    if left.shape != right.shape:\n",
|
||
"        raise ValueError(f\"Shape mismatch. Left: {left.shape}, Right: {right.shape}\")\n",
|
||
"    return torch.nn.Parameter(torch.tensor(right))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 64,
|
||
"id": "f22d5d95-ca5a-425c-a9ec-fc432a12d4e9",
|
||
"metadata": {},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"\n",
|
||
"def load_weights_into_gpt(gpt, params):\n",
|
||
"    # Positional and token embeddings\n",
|
||
"    gpt.pos_emb.weight = assign(gpt.pos_emb.weight, params['wpe'])\n",
|
||
"    gpt.tok_emb.weight = assign(gpt.tok_emb.weight, params['wte'])\n",
|
||
"\n",
|
||
"    for b in range(len(params[\"blocks\"])):\n",
|
||
"        # The checkpoint stores query, key, and value as one fused matrix; split it into three parts.\n",
|
||
"        # Linear weights are transposed (.T) to match the layout of torch.nn.Linear.\n",
|
||
"        q_w, k_w, v_w = np.split((params[\"blocks\"][b][\"attn\"][\"c_attn\"])[\"w\"], 3, axis=-1)\n",
|
||
"        gpt.trf_blocks[b].att.W_query.weight = assign(gpt.trf_blocks[b].att.W_query.weight, q_w.T)\n",
|
||
"        gpt.trf_blocks[b].att.W_key.weight = assign(gpt.trf_blocks[b].att.W_key.weight, k_w.T)\n",
|
||
"        gpt.trf_blocks[b].att.W_value.weight = assign(gpt.trf_blocks[b].att.W_value.weight, v_w.T)\n",
|
||
"\n",
|
||
"        q_b, k_b, v_b = np.split((params[\"blocks\"][b][\"attn\"][\"c_attn\"])[\"b\"], 3, axis=-1)\n",
|
||
"        gpt.trf_blocks[b].att.W_query.bias = assign(gpt.trf_blocks[b].att.W_query.bias, q_b)\n",
|
||
"        gpt.trf_blocks[b].att.W_key.bias = assign(gpt.trf_blocks[b].att.W_key.bias, k_b)\n",
|
||
"        gpt.trf_blocks[b].att.W_value.bias = assign(gpt.trf_blocks[b].att.W_value.bias, v_b)\n",
|
||
"\n",
|
||
"        # Attention output projection\n",
|
||
"        gpt.trf_blocks[b].att.out_proj.weight = assign(gpt.trf_blocks[b].att.out_proj.weight, params[\"blocks\"][b][\"attn\"][\"c_proj\"][\"w\"].T)\n",
|
||
"        gpt.trf_blocks[b].att.out_proj.bias = assign(gpt.trf_blocks[b].att.out_proj.bias, params[\"blocks\"][b][\"attn\"][\"c_proj\"][\"b\"])\n",
|
||
"\n",
|
||
"        # Feed-forward layers\n",
|
||
"        gpt.trf_blocks[b].ff.layers[0].weight = assign(gpt.trf_blocks[b].ff.layers[0].weight, params[\"blocks\"][b][\"mlp\"][\"c_fc\"][\"w\"].T)\n",
|
||
"        gpt.trf_blocks[b].ff.layers[0].bias = assign(gpt.trf_blocks[b].ff.layers[0].bias, params[\"blocks\"][b][\"mlp\"][\"c_fc\"][\"b\"])\n",
|
||
"        gpt.trf_blocks[b].ff.layers[2].weight = assign(gpt.trf_blocks[b].ff.layers[2].weight, params[\"blocks\"][b][\"mlp\"][\"c_proj\"][\"w\"].T)\n",
|
||
"        gpt.trf_blocks[b].ff.layers[2].bias = assign(gpt.trf_blocks[b].ff.layers[2].bias, params[\"blocks\"][b][\"mlp\"][\"c_proj\"][\"b\"])\n",
|
||
"\n",
|
||
"        # Layer norms of the transformer block\n",
|
||
"        gpt.trf_blocks[b].norm1.scale = assign(gpt.trf_blocks[b].norm1.scale, params[\"blocks\"][b][\"ln_1\"][\"g\"])\n",
|
||
"        gpt.trf_blocks[b].norm1.shift = assign(gpt.trf_blocks[b].norm1.shift, params[\"blocks\"][b][\"ln_1\"][\"b\"])\n",
|
||
"        gpt.trf_blocks[b].norm2.scale = assign(gpt.trf_blocks[b].norm2.scale, params[\"blocks\"][b][\"ln_2\"][\"g\"])\n",
|
||
"        gpt.trf_blocks[b].norm2.shift = assign(gpt.trf_blocks[b].norm2.shift, params[\"blocks\"][b][\"ln_2\"][\"b\"])\n",
|
||
"\n",
|
||
"    gpt.final_norm.scale = assign(gpt.final_norm.scale, params[\"g\"])\n",
|
||
"    gpt.final_norm.shift = assign(gpt.final_norm.shift, params[\"b\"])\n",
|
||
"    # Weight tying: the output head reuses the token embedding weights from the checkpoint\n",
|
||
"    gpt.out_head.weight = assign(gpt.out_head.weight, params[\"wte\"])\n",
|
||
"\n",
|
||
"\n",
|
||
"load_weights_into_gpt(gpt, params)\n",
|
||
"gpt.to(device);"
|
||
]
|
||
},
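{
"cell_type": "markdown",
"id": "e9a0b6c2-1d47-4a3b-qkv-split-sketch",
"metadata": {},
"source": [
"- The `np.split` and `.T` steps in the loading function can be illustrated in isolation. The following is a minimal sketch (not part of the original chapter code) with a toy `emb_dim` of 4 and random values standing in for the checkpoint arrays; it assumes, as the loading code implies, that the fused `c_attn` weight has shape `(emb_dim, 3 * emb_dim)` and is applied as `x @ w + b`.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import torch\n",
"\n",
"rng = np.random.default_rng(123)\n",
"emb_dim = 4  # toy size (assumption); GPT-2 small uses 768\n",
"\n",
"# Random stand-ins for the fused query/key/value weight and bias\n",
"c_attn_w = rng.standard_normal((emb_dim, 3 * emb_dim)).astype(np.float32)\n",
"c_attn_b = rng.standard_normal(3 * emb_dim).astype(np.float32)\n",
"\n",
"# Split the last axis into three equal parts: query, key, value\n",
"q_w, k_w, v_w = np.split(c_attn_w, 3, axis=-1)\n",
"q_b, k_b, v_b = np.split(c_attn_b, 3, axis=-1)\n",
"\n",
"# torch.nn.Linear stores its weight as (out_features, in_features), hence the transpose\n",
"W_query = torch.nn.Linear(emb_dim, emb_dim, bias=True)\n",
"W_query.weight = torch.nn.Parameter(torch.tensor(q_w.T))\n",
"W_query.bias = torch.nn.Parameter(torch.tensor(q_b))\n",
"\n",
"# Both formulations should produce the same query vectors\n",
"x = rng.standard_normal((2, emb_dim)).astype(np.float32)\n",
"expected = x @ q_w + q_b\n",
"actual = W_query(torch.tensor(x)).detach().numpy()\n",
"print(np.allclose(expected, actual, atol=1e-5))\n",
"```\n",
"\n",
"- Without the transpose, the shape check in `assign` would pass whenever the matrix is square, so a missing `.T` on the query, key, value, or output projection weights would go unnoticed by that check and only show up as degraded model output."
]
},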
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "4f7472cb-54dc-4311-96d8-b2694f885cee",
|
||
"metadata": {},
|
||
"source": [
|
||
"- If the model is loaded correctly, we can use it together with our earlier `generate` function to generate new text:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 65,
|
||
"id": "1f690253-f845-4347-b7b6-43fabbd2affa",
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Output text:\n",
|
||
" Every effort moves you toward an equal share for each vote plus half. Inequality is often not an accurate representation of human worth; to know the\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"torch.manual_seed(123)\n",
|
||
"\n",
|
||
"token_ids = generate(\n",
|
||
"    model=gpt,\n",
|
||
"    idx=text_to_token_ids(\"Every effort moves you\", tokenizer),\n",
|
||
"    max_new_tokens=25,\n",
|
||
"    context_size=NEW_CONFIG[\"ctx_len\"],\n",
|
||
"    top_k=50,\n",
|
||
"    temperature=1.5\n",
|
||
")\n",
|
||
"\n",
|
||
"print(\"Output text:\\n\", token_ids_to_text(token_ids, tokenizer))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "6d079f98-a7c4-462e-8416-5a64f670861c",
|
||
"metadata": {},
|
||
"source": [
|
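||
"- We know that the model weights were loaded correctly because the model produces coherent text; even a small mistake in the loading code would have prevented that."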
|
||
]
|
||
},
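{
"cell_type": "markdown",
"id": "f2c8d5e3-7a10-4b9c-weight-check-sketch",
"metadata": {},
"source": [
"- Coherent output is an indirect check. As a complement, the following minimal sketch (not part of the original chapter code) compares a PyTorch parameter with its source NumPy array directly. The helper name `matches` and the toy `torch.nn.Embedding` layer are hypothetical and only serve to keep the example self-contained.\n",
"\n",
"```python\n",
"import numpy as np\n",
"import torch\n",
"\n",
"\n",
"def matches(param, array, atol=1e-6):\n",
"    # True if the parameter has the same shape and (approximately) the same values as the array\n",
"    if tuple(param.shape) != tuple(array.shape):\n",
"        return False\n",
"    return torch.allclose(param.detach().cpu(), torch.tensor(array), atol=atol)\n",
"\n",
"\n",
"torch.manual_seed(123)\n",
"emb = torch.nn.Embedding(10, 4)  # toy stand-in for a model layer (assumption)\n",
"source = np.arange(40, dtype=np.float32).reshape(10, 4)  # toy stand-in for checkpoint values\n",
"\n",
"print(matches(emb.weight, source))  # randomly initialized weights, so a match is not expected\n",
"emb.weight = torch.nn.Parameter(torch.tensor(source))\n",
"print(matches(emb.weight, source))  # the same values were just assigned\n",
"```\n",
"\n",
"- In this notebook, a call such as `matches(gpt.tok_emb.weight, params['wte'])` would apply the same comparison to the loaded model."
]
},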
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "28493b9b-a1ae-4f31-87bc-c10ee4447f44",
|
||
"metadata": {},
|
||
"source": [
|
||
"- For an alternative way to load the weights from the Hugging Face Hub, see [../02_alternative_weight_loading](../02_alternative_weight_loading)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "f2a66474-230d-4180-a8ff-843e04f1f1c4",
|
||
"metadata": {},
|
||
"source": [
|
||
"## Summary and takeaways"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"id": "fc7ed189-a633-458c-bf12-4f70b42684b8",
|
||
"metadata": {},
|
||
"source": [
|
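||
"- See [gpt_train.py](gpt_train.py) for a standalone training script.\n",
|
||
"- The [gpt_generate.py](gpt_generate.py) script loads the pretrained weights from OpenAI and generates text based on a prompt.\n",
|
||
"- You can find the exercise solutions in [exercise-solutions.ipynb](exercise-solutions.ipynb)."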
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"accelerator": "GPU",
|
||
"colab": {
|
||
"gpuType": "A100",
|
||
"machine_shape": "hm",
|
||
"provenance": []
|
||
},
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.9.13"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 5
|
||
}
|