llms-from-scratch-cn/Codes/appendix-A/03_main-chapter-code/code-part1.ipynb
2024-06-10 17:00:23 +08:00

{
"cells": [
{
"cell_type": "markdown",
"id": "ca7fc8a0-280c-4979-b0c7-fc3a99b3b785",
"metadata": {},
"source": [
"# Appendix A: Introduction to PyTorch (Part 1)"
]
},
{
"cell_type": "markdown",
"id": "f5bf13d2-8fc2-483e-88cc-6b4310221e68",
"metadata": {},
"source": [
"## A.1 What is PyTorch"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "96ee5660-5327-48e2-9104-a882b3b2afa4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.0.1\n"
]
}
],
"source": [
"import torch\n",
"# Print the installed PyTorch version\n",
"print(torch.__version__)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f73ad4e4-7ec6-4467-a9e9-0cdf6d195264",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"False\n"
]
}
],
"source": [
"# Check whether PyTorch can use a CUDA GPU: True if one is available, False means CPU only\n",
"print(torch.cuda.is_available())"
]
},
{
"cell_type": "markdown",
"id": "2100cf2e-7459-4ab3-92a8-43e86ab35a9b",
"metadata": {},
"source": [
"## A.2 向量"
]
},
{
"cell_type": "markdown",
"id": "26d7f785-e048-42bc-9182-a556af6bb7f4",
"metadata": {},
"source": [
"### A.2.1 标量、向量、矩阵和张量\n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a3a464d6-cec8-4363-87bd-ea4f900baced",
"metadata": {},
"outputs": [],
"source": [
"import torch\n",
"import numpy as np\n",
"\n",
"# Create a 0D tensor (scalar) from a Python integer\n",
"tensor0d = torch.tensor(1)\n",
"\n",
"# Create a 1D tensor (vector) from a Python list\n",
"tensor1d = torch.tensor([1, 2, 3])\n",
"\n",
"# Create a 2D tensor (matrix) from a nested Python list\n",
"tensor2d = torch.tensor([[1, 2], [3, 4]])\n",
"\n",
"# Create a 3D tensor from a nested Python list\n",
"tensor3d_1 = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])\n",
"\n",
"# Create a 3D tensor from a NumPy array\n",
"ary3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])\n",
"tensor3d_2 = torch.tensor(ary3d)  # Copies the NumPy array\n",
"tensor3d_3 = torch.from_numpy(ary3d)  # Shares memory with the NumPy array"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "dbe14c47-499a-4d48-b354-a0e6fd957872",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[[1, 2],\n",
" [3, 4]],\n",
"\n",
" [[5, 6],\n",
" [7, 8]]])\n"
]
}
],
"source": [
"ary3d[0, 0, 0] = 999\n",
"print(tensor3d_2)  # Unchanged, because torch.tensor() copied the data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "e3e4c23a-cdba-46f5-a2dc-5fb32bf9117b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[[999, 2],\n",
" [ 3, 4]],\n",
"\n",
" [[ 5, 6],\n",
" [ 7, 8]]])\n"
]
}
],
"source": [
"print(tensor3d_3)  # Changed, because it shares memory with ary3d"
]
},
{
"cell_type": "markdown",
"id": "63dec48d-2b60-41a2-ac06-fef7e718605a",
"metadata": {},
"source": [
"### A.2.2 向量的数据类型"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3f48c014-e1a2-4a53-b5c5-125812d4034c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.int64\n"
]
}
],
"source": [
"tensor1d = torch.tensor([1, 2, 3])\n",
"print(tensor1d.dtype)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "5429a086-9de2-4ac7-9f14-d087a7507394",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.float32\n"
]
}
],
"source": [
"floatvec = torch.tensor([1.0, 2.0, 3.0])\n",
"print(floatvec.dtype)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "a9a438d1-49bb-481c-8442-7cc2bb3dd4af",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.float32\n"
]
}
],
"source": [
"floatvec = tensor1d.to(torch.float32)\n",
"print(floatvec.dtype)"
]
},
{
"cell_type": "markdown",
"id": "2020deb5-aa02-4524-b311-c010f4ad27ff",
"metadata": {},
"source": [
"### A.2.3 PyTorch中常见的张量操作"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "c02095f2-8a48-4953-b3c9-5313d4362ce7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1, 2, 3],\n",
" [4, 5, 6]])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d = torch.tensor([[1, 2, 3], [4, 5, 6]])\n",
"tensor2d"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "f33e1d45-5b2c-4afe-b4b2-66ac4099fd1a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"torch.Size([2, 3])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d.shape  # Shape of the tensor"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "f3a4129d-f870-4e03-9c32-cd8521cb83fe",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1, 2],\n",
" [3, 4],\n",
" [5, 6]])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d.reshape(3, 2)  # Reshape to 3 rows and 2 columns"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "589ac0a7-adc7-41f3-b721-155f580e9369",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1, 2],\n",
" [3, 4],\n",
" [5, 6]])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d.view(3, 2)  # Same reshape via view(), which shares the underlying data"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "344e307f-ba5d-4f9a-a791-2c75a3d1417e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[1, 4],\n",
" [2, 5],\n",
" [3, 6]])"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d.T  # Transpose the tensor"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "19a75030-6a41-4ca8-9aae-c507ae79225c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[14, 32],\n",
" [32, 77]])"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d.matmul(tensor2d.T)  # Matrix multiplication of tensor2d with its transpose"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e7c950bc-d640-4203-b210-3ac8932fe4d4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([[14, 32],\n",
" [32, 77]])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"tensor2d @ tensor2d.T  # The same matrix multiplication using the @ operator"
]
},
{
"cell_type": "markdown",
"id": "4c15bdeb-78e2-4870-8a4f-a9f591666f38",
"metadata": {},
"source": [
"## A.3 把模型作为计算图"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "22af61e9-0443-4705-94d7-24c21add09c7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor(0.0852)\n"
]
}
],
"source": [
"import torch.nn.functional as F\n",
"\n",
"y = torch.tensor([1.0])  # True label\n",
"x1 = torch.tensor([1.1])  # Input feature\n",
"w1 = torch.tensor([2.2])  # Weight parameter\n",
"b = torch.tensor([0.0])  # Bias unit\n",
"\n",
"z = x1 * w1 + b  # Net input\n",
"a = torch.sigmoid(z)  # Activation and output\n",
"\n",
"loss = F.binary_cross_entropy(a, y)\n",
"print(loss)"
]
},
{
"cell_type": "markdown",
"id": "f9424f26-2bac-47e7-b834-92ece802247c",
"metadata": {},
"source": [
"## A.4 自动求导"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "ebf5cef7-48d6-4d2a-8ab0-0fb10bdd7d1a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(tensor([-0.0898]),)\n",
"(tensor([-0.0817]),)\n"
]
}
],
"source": [
"import torch.nn.functional as F\n",
"from torch.autograd import grad\n",
"\n",
"y = torch.tensor([1.0])\n",
"x1 = torch.tensor([1.1])\n",
"w1 = torch.tensor([2.2], requires_grad=True)\n",
"b = torch.tensor([0.0], requires_grad=True)\n",
"\n",
"z = x1 * w1 + b \n",
"a = torch.sigmoid(z)\n",
"\n",
"loss = F.binary_cross_entropy(a, y)\n",
"\n",
"grad_L_w1 = grad(loss, w1, retain_graph=True)\n",
"grad_L_b = grad(loss, b, retain_graph=True)\n",
"\n",
"print(grad_L_w1)\n",
"print(grad_L_b)"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "93c5875d-f6b2-492c-b5ef-7e132f93a4e0",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([-0.0898])\n",
"tensor([-0.0817])\n"
]
}
],
"source": [
"loss.backward()  # Backpropagation: stores the gradients in w1.grad and b.grad\n",
"\n",
"print(w1.grad)\n",
"print(b.grad)"
]
},
{
"cell_type": "markdown",
"id": "f53bdd7d-44e6-40ab-8a5a-4eef74ef35dc",
"metadata": {},
"source": [
"## A.5 多层神经网络的实现"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "84b749e1-7768-4cfe-94d6-a08c7feff4a1",
"metadata": {},
"outputs": [],
"source": [
"class NeuralNetwork(torch.nn.Module):\n",
"    def __init__(self, num_inputs, num_outputs):\n",
"        super().__init__()\n",
"\n",
"        self.layers = torch.nn.Sequential(\n",
"\n",
"            # First hidden layer\n",
"            torch.nn.Linear(num_inputs, 30),\n",
"            torch.nn.ReLU(),\n",
"\n",
"            # Second hidden layer\n",
"            torch.nn.Linear(30, 20),\n",
"            torch.nn.ReLU(),\n",
"\n",
"            # Output layer\n",
"            torch.nn.Linear(20, num_outputs),\n",
"        )\n",
"\n",
"    def forward(self, x):\n",
"        logits = self.layers(x)\n",
"        return logits"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "c5b59e2e-1930-456d-93b9-f69263e3adbe",
"metadata": {},
"outputs": [],
"source": [
"model = NeuralNetwork(50, 3)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "39d02a21-33e7-4879-8fd2-d6309faf2f8d",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"NeuralNetwork(\n",
" (layers): Sequential(\n",
" (0): Linear(in_features=50, out_features=30, bias=True)\n",
" (1): ReLU()\n",
" (2): Linear(in_features=30, out_features=20, bias=True)\n",
" (3): ReLU()\n",
" (4): Linear(in_features=20, out_features=3, bias=True)\n",
" )\n",
")\n"
]
}
],
"source": [
"print(model)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "94535738-de02-4c2a-9b44-1cd186fa990a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Total number of trainable model parameters: 2213\n"
]
}
],
"source": [
"num_params = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
"print(\"Total number of trainable model parameters:\", num_params)  # Count of trainable parameters"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "2c394106-ad71-4ccb-a3c9-9b60af3fa748",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Parameter containing:\n",
"tensor([[-0.0064, 0.0004, -0.0903, ..., -0.1316, 0.0910, 0.0363],\n",
" [ 0.1354, 0.1124, -0.0476, ..., 0.0578, 0.1014, 0.0008],\n",
" [ 0.0975, -0.0478, 0.0298, ..., 0.0416, 0.0849, 0.1314],\n",
" ...,\n",
" [ 0.0118, 0.0240, 0.0420, ..., -0.1305, -0.0517, -0.0826],\n",
" [-0.0323, 0.1073, 0.0215, ..., -0.1264, -0.1100, 0.1232],\n",
" [ 0.0861, 0.0403, -0.0545, ..., 0.1352, 0.0817, -0.0938]],\n",
" requires_grad=True)\n"
]
}
],
"source": [
"print(model.layers[0].weight)  # Weight matrix of the first linear layer"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "b201882b-9285-4db9-bb63-43afe6a2ff9e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Parameter containing:\n",
"tensor([[-0.0577, 0.0047, -0.0702, ..., 0.0222, 0.1260, 0.0865],\n",
" [ 0.0502, 0.0307, 0.0333, ..., 0.0951, 0.1134, -0.0297],\n",
" [ 0.1077, -0.1108, 0.0122, ..., 0.0108, -0.1049, -0.1063],\n",
" ...,\n",
" [-0.0787, 0.1259, 0.0803, ..., 0.1218, 0.1303, -0.1351],\n",
" [ 0.1359, 0.0175, -0.0673, ..., 0.0674, 0.0676, 0.1058],\n",
" [ 0.0790, 0.1343, -0.0293, ..., 0.0344, -0.0971, -0.0509]],\n",
" requires_grad=True)\n"
]
}
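],
"source": [
"# Seed the random number generator so the weight initialization is reproducible\n",
"torch.manual_seed(123)\n",
"\n",
"# NeuralNetwork (defined above) takes the number of input features and the number of outputs\n",
"model = NeuralNetwork(50, 3)\n",
"\n",
"# Print the weight matrix of the first linear layer\n",
"print(model.layers[0].weight)"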
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "1da9a35e-44f3-460c-90fe-304519736fd6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"torch.Size([30, 50])\n"
]
}
],
"source": [
"# Print the shape of the first layer's weight matrix\n",
"print(model.layers[0].weight.shape)"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "57eadbae-90fe-43a3-a33f-c23a095ba42a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[-0.1262, 0.1080, -0.1792]], grad_fn=<AddmmBackward0>)\n"
]
}
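],
"source": [
"# Seed the random number generator so the random input is reproducible\n",
"torch.manual_seed(123)\n",
"\n",
"# One random example with 50 input features, matching the model's input size\n",
"X = torch.rand((1, 50))\n",
"\n",
"# Run a forward pass to compute the output\n",
"out = model(X)\n",
"\n",
"# Print the output logits\n",
"print(out)"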
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "48d720cb-ef73-4b7b-92e0-8198a072defd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[-0.1262, 0.1080, -0.1792]])\n"
]
}
],
"source": [
"# Use the torch.no_grad() context manager so gradients are not tracked during inference\n",
"with torch.no_grad():\n",
"    out = model(X)\n",
"print(out)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "10df3640-83c3-4061-a74d-08f07a5cc6ac",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[0.3113, 0.3934, 0.2952]])\n"
]
}
],
"source": [
"with torch.no_grad():\n",
"    out = torch.softmax(model(X), dim=1)\n",
"print(out)"
]
},
{
"cell_type": "markdown",
"id": "19858180-0f26-43a8-b2c3-7ed40abf9f85",
"metadata": {},
"source": [
"## A.6 建立高效的数据加载器"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "b9dc2745-8be8-4344-80ef-325f02cda7b7",
"metadata": {},
"outputs": [],
"source": [
"# Training features: five examples with two features each\n",
"X_train = torch.tensor([\n",
"    [-1.2, 3.1],\n",
"    [-0.9, 2.9],\n",
"    [-0.5, 2.6],\n",
"    [2.3, -1.1],\n",
"    [2.7, -1.5]\n",
"])\n",
"\n",
"# Corresponding class labels\n",
"y_train = torch.tensor([0, 0, 0, 1, 1])"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "88283948-5fca-461a-98a1-788b6be191d5",
"metadata": {},
"outputs": [],
"source": [
"X_test = torch.tensor([\n",
"    [-0.8, 2.8],\n",
"    [2.6, -1.6],\n",
"])\n",
"\n",
"y_test = torch.tensor([0, 1])"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "edf323e2-1789-41a0-8e44-f3cab16e5f5d",
"metadata": {},
"outputs": [],
"source": [
"from torch.utils.data import Dataset\n",
"\n",
"\n",
"class ToyDataset(Dataset):\n",
"    # Store the feature and label tensors\n",
"    def __init__(self, X, y):\n",
"        self.features = X\n",
"        self.labels = y\n",
"\n",
"    # Return the (features, label) pair at the given index\n",
"    def __getitem__(self, index):\n",
"        one_x = self.features[index]\n",
"        one_y = self.labels[index]\n",
"        return one_x, one_y\n",
"\n",
"    # Return the number of examples in the dataset\n",
"    def __len__(self):\n",
"        return self.labels.shape[0]\n",
"\n",
"\n",
"# Create the training and test dataset instances\n",
"train_ds = ToyDataset(X_train, y_train)\n",
"test_ds = ToyDataset(X_test, y_test)"
]
},
{
"cell_type": "code",
"execution_count": 32,
"id": "b7014705-1fdc-4f72-b892-d8db8bebc331",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"5"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(train_ds)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"id": "3ec6627a-4c3f-481a-b794-d2131be95eaf",
"metadata": {},
"outputs": [],
"source": [
"from torch.utils.data import DataLoader\n",
"\n",
"torch.manual_seed(123)\n",
"\n",
"# Create the training data loader\n",
"# dataset: the ToyDataset instance train_ds defined above\n",
"# batch_size: number of examples per batch\n",
"# shuffle: whether to reshuffle the data at the start of every epoch\n",
"# num_workers: number of subprocesses used for data loading (0 loads in the main process)\n",
"train_loader = DataLoader(\n",
"    dataset=train_ds,\n",
"    batch_size=2,\n",
"    shuffle=True,\n",
"    num_workers=0\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 34,
"id": "8c9446de-5e4b-44fa-bf9a-a63e2661027e",
"metadata": {},
"outputs": [],
"source": [
"test_ds = ToyDataset(X_test, y_test)\n",
"\n",
"# Create the test data loader\n",
"# dataset: the ToyDataset instance test_ds\n",
"# batch_size: number of examples per batch\n",
"# shuffle: False here, so the test data keeps its original order\n",
"# num_workers: number of subprocesses used for data loading (0 loads in the main process)\n",
"test_loader = DataLoader(\n",
"    dataset=test_ds,\n",
"    batch_size=2,\n",
"    shuffle=False,\n",
"    num_workers=0\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 35,
"id": "99d4404c-9884-419f-979c-f659742d86ef",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Batch 1: tensor([[ 2.3000, -1.1000],\n",
" [-0.9000, 2.9000]]) tensor([1, 0])\n",
"Batch 2: tensor([[-1.2000, 3.1000],\n",
" [-0.5000, 2.6000]]) tensor([0, 0])\n",
"Batch 3: tensor([[ 2.7000, -1.5000]]) tensor([1])\n"
]
}
],
"source": [
"# Iterate over the training data loader\n",
"for idx, (x, y) in enumerate(train_loader):\n",
"    # Print the batch number, the input features, and the labels\n",
"    print(f\"Batch {idx+1}:\", x, y)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "9d003f7e-7a80-40bf-a7fb-7a0d7dbba9db",
"metadata": {},
"outputs": [],
"source": [
"train_loader = DataLoader(\n",
"    dataset=train_ds,\n",
"    batch_size=2,\n",
"    shuffle=True,\n",
"    num_workers=0,\n",
"    drop_last=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4db4d7f4-82da-44a4-b94e-ee04665d9c3c",
"metadata": {},
"outputs": [],
"source": [
"for idx, (x, y) in enumerate(train_loader):\n",
"    print(f\"Batch {idx+1}:\", x, y)"
]
},
{
"cell_type": "markdown",
"id": "d904ca82-e50f-4f3d-a3ac-fc6ca53dd00e",
"metadata": {},
"source": [
"## A.7 A typical training loop"
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "93f1791a-d887-4fc5-a307-5e5bde9e06f6",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Epoch: 001/003 | Batch 000/002 | Train/Val Loss: 0.75\n",
"Epoch: 001/003 | Batch 001/002 | Train/Val Loss: 0.65\n",
"Epoch: 002/003 | Batch 000/002 | Train/Val Loss: 0.44\n",
"Epoch: 002/003 | Batch 001/002 | Train/Val Loss: 0.13\n",
"Epoch: 003/003 | Batch 000/002 | Train/Val Loss: 0.03\n",
"Epoch: 003/003 | Batch 001/002 | Train/Val Loss: 0.00\n"
]
}
],
"source": [
"import torch.nn.functional as F\n",
"\n",
"\n",
"torch.manual_seed(123)\n",
"model = NeuralNetwork(num_inputs=2, num_outputs=2)\n",
"optimizer = torch.optim.SGD(model.parameters(), lr=0.5)\n",
"\n",
"num_epochs = 3\n",
"\n",
"for epoch in range(num_epochs):\n",
"\n",
"    model.train()\n",
"    for batch_idx, (features, labels) in enumerate(train_loader):\n",
"\n",
"        logits = model(features)\n",
"\n",
"        loss = F.cross_entropy(logits, labels)  # Loss function\n",
"\n",
"        optimizer.zero_grad()  # Reset gradients from the previous step\n",
"        loss.backward()  # Compute gradients of the loss\n",
"        optimizer.step()  # Update the model parameters\n",
"\n",
"        ### LOGGING\n",
"        print(f\"Epoch: {epoch+1:03d}/{num_epochs:03d}\"\n",
"              f\" | Batch {batch_idx:03d}/{len(train_loader):03d}\"\n",
"              f\" | Train/Val Loss: {loss:.2f}\")\n",
"\n",
"    model.eval()\n",
"    # Optional model evaluation"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "00dcf57f-6a7e-4af7-aa5a-df2cb0866fa5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[ 2.8569, -4.1618],\n",
" [ 2.5382, -3.7548],\n",
" [ 2.0944, -3.1820],\n",
" [-1.4814, 1.4816],\n",
" [-1.7176, 1.7342]])\n"
]
}
],
"source": [
"model.eval()\n",
"\n",
"with torch.no_grad():\n",
"    outputs = model(X_train)\n",
"\n",
"print(outputs)"
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "19be7390-18b8-43f9-9841-d7fb1919f6fd",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([[ 0.9991, 0.0009],\n",
" [ 0.9982, 0.0018],\n",
" [ 0.9949, 0.0051],\n",
" [ 0.0491, 0.9509],\n",
" [ 0.0307, 0.9693]])\n",
"tensor([0, 0, 0, 1, 1])\n"
]
}
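],
"source": [
"# Turn off scientific notation when printing tensors\n",
"torch.set_printoptions(sci_mode=False)\n",
"\n",
"# outputs holds the logits computed in the previous cell\n",
"\n",
"# Apply softmax to the logits to obtain class probabilities\n",
"probas = torch.softmax(outputs, dim=1)\n",
"print(probas)\n",
"\n",
"# Predicted class for each example: the index of the largest logit,\n",
"# which is also the most probable class because softmax preserves the ordering\n",
"predictions = torch.argmax(outputs, dim=1)\n",
"print(predictions)"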
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "07e7e530-f8d3-429c-9f5e-cf8078078c0e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"tensor([0, 0, 0, 1, 1])\n"
]
}
],
"source": [
"# torch.argmax along dim=1 returns the index of the largest value for each example, i.e. the predicted class\n",
"predictions = torch.argmax(outputs, dim=1)\n",
"print(predictions)"
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "5f756f0d-63c8-41b5-a5d8-01baa847e026",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor([True, True, True, True, True])"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"predictions == y_train"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "da274bb0-f11c-4c81-a880-7a031fbf2943",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"tensor(5)"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"torch.sum(predictions == y_train)"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "16d62314-8dee-45b0-8f55-9e5aae2b24f4",
"metadata": {},
"outputs": [],
"source": [
"def compute_accuracy(model, dataloader):\n",
"    \"\"\"\n",
"    Compute the model's accuracy on the given data loader.\n",
"\n",
"    Args:\n",
"        model (torch.nn.Module): The model to evaluate.\n",
"        dataloader (torch.utils.data.DataLoader): Data loader that yields (features, labels) batches.\n",
"\n",
"    Returns:\n",
"        float: Fraction of correctly classified examples.\n",
"    \"\"\"\n",
"    # Put the model in evaluation mode\n",
"    model = model.eval()\n",
"    correct = 0.0\n",
"    total_examples = 0\n",
"\n",
"    # Iterate over the batches\n",
"    for idx, (features, labels) in enumerate(dataloader):\n",
"\n",
"        # Disable gradient tracking during the forward pass\n",
"        with torch.no_grad():\n",
"            logits = model(features)\n",
"\n",
"        # Predicted class per example and number of correct predictions\n",
"        predictions = torch.argmax(logits, dim=1)\n",
"        compare = labels == predictions\n",
"        correct += torch.sum(compare)\n",
"        total_examples += len(compare)\n",
"\n",
"    # Accuracy = correct predictions / total examples\n",
"    return (correct / total_examples).item()"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "4f6c9c17-2a5f-46c0-804b-873f169b729a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0"
]
},
"execution_count": 45,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"compute_accuracy(model, train_loader)"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "311ed864-e21e-4aac-97c7-c6086caef27a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1.0"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"compute_accuracy(model, test_loader)"
]
},
{
"cell_type": "markdown",
"id": "4d5cd469-3a45-4394-944b-3ce543f41dac",
"metadata": {},
"source": [
"## A.8 保存并加载模型"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "b013127d-a2c3-4b04-9fb3-a6a7c88d83c5",
"metadata": {},
"outputs": [],
"source": [
"torch.save(model.state_dict(), \"model.pth\")"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "b2b428c2-3a44-4d91-97c4-8298cf2b51eb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model = NeuralNetwork(2, 2)  # The architecture must match the saved model exactly\n",
"model.load_state_dict(torch.load(\"model.pth\"))"
]
},
{
"cell_type": "markdown",
"id": "f891c013-43da-4a05-973d-997be313d2d8",
"metadata": {},
"source": [
"## A.9 使用GPU来优化训练性能"
]
},
{
"cell_type": "markdown",
"id": "e68ae888-cabf-49c9-bad6-ecdce774db57",
"metadata": {},
"source": [
"### A.9.1 在GPU上进行 PyTorch 的运算"
]
},
{
"cell_type": "markdown",
"id": "141c845f-efe3-4614-b376-b8b7a9a2c887",
"metadata": {},
"source": [
"See [code-part2.ipynb](code-part2.ipynb) "
]
},
{
"cell_type": "markdown",
"id": "99811829-b817-42ea-b03e-d35374debcc0",
"metadata": {},
"source": [
"### A.9.2 单个GPU的训练"
]
},
{
"cell_type": "markdown",
"id": "0b21456c-4af7-440f-9e78-37770277b5bc",
"metadata": {},
"source": [
"See [code-part2.ipynb](code-part2.ipynb)"
]
},
{
"cell_type": "markdown",
"id": "db6eb2d1-a341-4489-b04b-635c26945333",
"metadata": {},
"source": [
"### A.9.3 多GPU的训练"
]
},
{
"cell_type": "markdown",
"id": "9d049a81-5fb0-49b5-9d6a-17a9976d8520",
"metadata": {},
"source": [
"See [DDP-script.py](DDP-script.py)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b32db94f-f159-4aa3-91cc-5b937eef7fc7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}