[PyTorch Tutorials] PyTorch tutorial summary/translation

2020. 2. 3. 14:53 · nlp


https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html#sphx-glr-beginner-blitz-tensor-tutorial-py

 

What is PyTorch? — PyTorch Tutorials 1.4.0 documentation


 

Tensors

- similar to NumPy's ndarray

- can also run on a GPU for fast computation (see the sketch below)
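A minimal sketch of the GPU part, following the CUDA Tensors section of the same tutorial; it assumes a CUDA-capable machine is available:

import torch

# tensors can be created on, or moved to, a device
if torch.cuda.is_available():
    device = torch.device("cuda")
    x = torch.ones(5, 3, device=device)   # created directly on the GPU
    y = torch.ones(5, 3).to(device)       # or moved from the CPU
    z = x + y                             # the addition runs on the GPU
    print(z.to("cpu", torch.double))      # .to() can also change the dtype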

 

Creating tensors

1) uninitialized : torch.empty(5,3) # size

2) randomly initialized : torch.rand(5,3)

3) filled with zeros : torch.zeros(5,3, dtype=torch.long)

4) directly from data : torch.tensor([1,2])

5) from an existing tensor :

 

# a 5x3 tensor filled with ones, reusing the properties of an existing tensor x
x = x.new_ones(5, 3, dtype=torch.double)

# same size as x, but with random values
# without the explicit dtype it would also have inherited x's dtype
y = torch.randn_like(x, dtype=torch.float)

 

Checking a tensor's size

- x.size() # returns a torch.Size, which is in fact a tuple

 

Adding tensors

1) x+y

2) torch.add(x,y)

3) adding one tensor into another (in-place addition) : y.add_(x)

4) pre-allocating an output tensor for the result :

 

result=torch.empty(5,3)
torch.add(x,y, out=result)
print(result)

 

Tensor indexing

- same syntax as NumPy

- extracting only the second column of a tensor : x[:, 1] (example below)
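A small sketch of column vs. row indexing (the values here are my own illustration):

import torch

x = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])
print(x[:, 1])  # tensor([2, 5])  -- second column
print(x[1, :])  # tensor([4, 5, 6])  -- second row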

 

Resizing/reshaping tensors

- torch.view

* -1 means: infer this dimension from the remaining ones

x = torch.randn(4,4) # size (4,4)
y = x.view(16) # size (16)
z = x.view(-1,8) # size (2,8), since 16/8 = 2

 

One-element tensors

- use .item() to extract the value as a Python number

 

x=torch.randn(1)
x.item() 

 

Tensors and NumPy arrays

- a CPU tensor and the NumPy array converted from it share the same underlying memory, so modifying one automatically changes the other

 

1) tensor to NumPy array : .numpy()

a = torch.ones(5) # tensor([1,1,1,1,1])
b = a.numpy() # [1,1,1,1,1]

a.add_(1) 
print(a) # tensor([2,2,2,2,2])
print(b) # [2,2,2,2,2]

 

2) NumPy array to tensor : torch.from_numpy()

 

import numpy as np 
a = np.ones(5) # [1,1,1,1,1]
b = torch.from_numpy(a) # tensor([1,1,1,1,1])
np.add(a, 1, out=a)
print(a) # [2,2,2,2,2]
print(b) # tensor([2,2,2,2,2])
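The tutorial notes that all CPU tensors except CharTensor support this round trip; a CUDA tensor must be moved to the CPU first. A sketch, assuming a CUDA device is available:

if torch.cuda.is_available():
    t = torch.ones(5, device="cuda")
    arr = t.cpu().numpy()  # calling .numpy() directly on a CUDA tensor raises an error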

 

 

https://pytorch.org/tutorials/beginner/blitz/autograd_tutorial.html

 

Autograd: Automatic Differentiation — PyTorch Tutorials 1.4.0 documentation


 

Autograd : Automatic Differentiation

- track every operation on a tensor so gradients can be computed : requires_grad=True

- to inspect the function that created a tensor : .grad_fn

- to turn on requires_grad on an existing tensor (in-place) : .requires_grad_(True)

- to stop gradient tracking in a section of code : with torch.no_grad():

 

x = torch.ones(2,2, requires_grad=True) # automatically track operations to compute gradients

y = x+2
print(y)

<result>

tensor([[3., 3.],
        [3., 3.]], grad_fn=<AddBackward0>) * the grad_fn recorded for the addition

 

z = y*y*3
out = z.mean()
print(z,out)

 

<result>

tensor([[27., 27.],
        [27., 27.]], grad_fn=<MulBackward0>) tensor(27., grad_fn=<MeanBackward0>)

 

a = torch.randn(2,2)
print(a.requires_grad) # (default) False

a.requires_grad_(True)
print(a.requires_grad) # True

b = (a*a).sum()
print(b.grad_fn) # <SumBackward0 object at 0x7f0e12b45320>

 

print(x.requires_grad) # True
print((x**2).requires_grad) # True

with torch.no_grad():
   print((x**2).requires_grad) # False
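The same tutorial also mentions .detach(), which returns a new tensor with the same contents but cut off from the graph (continuing with the x defined above):

y = x.detach()
print(y.requires_grad)  # False
print(x.eq(y).all())    # tensor(True) -- same values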

 

Gradients

Running backpropagation

1) .backward()

2) .grad

 

x = torch.ones(2,2, requires_grad=True)

y = x+2

z = y*y*3

out = z.mean() # out is a single scalar (one number)

out.backward() # same as out.backward(torch.tensor(1.)); the argument can be omitted because out is a single scalar

print(x.grad)

 

<result>

tensor([[4.5000, 4.5000], [4.5000, 4.5000]])

<derivation>

out = 1/4 * ∑z_i

z_i = y*y*3 = (x_i+2) * (x_i+2) * 3 = 3(x_i+2)^2

∂out/∂x_i = 1/4 * 6(x_i+2) = 3/2 * (x_i+2)

every x_i equals 1, so every element of the gradient is 3/2 * 3 = 9/2 = 4.5
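A quick finite-difference check of this derivative (my own sketch, perturbing a single element of x):

import torch

eps = 1e-4
x0 = torch.ones(2, 2)
f = lambda v: (3 * (v + 2) ** 2).mean()  # the same function that produced out

xp = x0.clone(); xp[0, 0] += eps
xm = x0.clone(); xm[0, 0] -= eps
print((f(xp) - f(xm)) / (2 * eps))  # ~= 4.5, matching x.grad[0, 0]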

 

x = torch.randn(3, requires_grad=True) # a 1-D tensor of 3 numbers

y = x*2

# y.data.norm() : the square root of the sum of squares of y's elements (L2 norm)
while y.data.norm() < 1000:
    y = y*2

gradients = torch.tensor([0.1, 1.0, 0.0001], dtype=torch.float)
y.backward(gradients) # y is not a single scalar this time, so backward needs a gradient argument

print(x.grad) # tensor([5.1200e+01, 5.1200e+02, 5.1200e-02])
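Why these numbers: the repeated doubling leaves y = 2^k * x for some k, so the Jacobian is 2^k times the identity. backward(v) computes Jᵀ·v = 2^k * v; in the run above the factor happened to be 512, giving 512 * [0.1, 1.0, 0.0001] = [51.2, 512, 0.0512].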
    

 

 

https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html

 

Neural Networks — PyTorch Tutorials 1.4.0 documentation


 

Neural Networks (torch.nn)

Training procedure

1) define a neural network with learnable parameters/weights

2) iterate over a dataset of inputs and check the outputs

3) compute the loss (network output vs. correct answer)

4) backpropagate the error

5) update the weights : weight = weight - learning_rate*gradient (a by-hand version is sketched in the weight-update section below)

 

Building the network

 

input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

 

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()

        # first conv layer, Conv2d(in_channels, out_channels, kernel_size)
        # 1 input channel, 6 output channels, 5x5 kernel (filter)
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)

        # affine layers : y = Wx + b
        # Linear(in_features, out_features)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # conv1 -> relu -> max pooling (window size 2x2)
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))

        # if the pooling window is square, a single number is enough
        # here the window size is again 2x2
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)

        x = x.view(-1, self.num_flat_features(x))  # reshape (flatten)

        # fc1 -> relu
        x = F.relu(self.fc1(x))

        x = F.relu(self.fc2(x))
        x = self.fc3(x)

        return x

    def num_flat_features(self, x):
        # x.size() : (batch_size, channels, height, width)
        # x.size()[1:] : every dimension except the batch dimension
        size = x.size()[1:]
        num_features = 1

        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)
      
      

 

<result>

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

 

Checking the parameters

params = list(net.parameters())
print(len(params)) # 10 (a weight and a bias for each of the 5 layers)
print(params[0].size()) # size of the first layer's (conv1) weight : [6, 1, 5, 5]

 

(conv1 weight): torch.Size([6, 1, 5, 5])

(conv1 bias): torch.Size([6])

(conv2 weight): torch.Size([16, 6, 5, 5])

(conv2 bias): torch.Size([16])

(fc1 weight): torch.Size([120, 400])

(fc1 bias): torch.Size([120])

(fc2 weight): torch.Size([84, 120])

(fc2 bias): torch.Size([84])

(fc3 weight): torch.Size([10, 84])

(fc3 bias): torch.Size([10])
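The mapping above can be printed directly with named_parameters(), an easy way to see which size belongs to which parameter:

for name, p in net.named_parameters():
    print(name, p.size())
# conv1.weight torch.Size([6, 1, 5, 5])
# conv1.bias torch.Size([6])
# ... and so on for conv2, fc1, fc2, fc3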

 

Feeding data into the network

 

input = torch.randn(1,1,32,32) # this net (LeNet) expects 32x32 inputs
out = net(input)
print(out) # out.size() = torch.Size([1, 10])

 

 

 

net.zero_grad() # reset gradients to zero (without this they accumulate and give wrong results)
out.backward(torch.randn(1,10)) # backpropagate with random gradient values

 

Loss Function

 

output = net(input)
target = torch.randn(10) # dummy target
target = target.view(1,-1) # reshape the dummy target into one row, matching the output's shape
criterion = nn.MSELoss() # MSE : Mean Squared Error

loss = criterion(output, target)

 

Following the gradients back through each step

 

print(loss.grad_fn) # MSELoss 
print(loss.grad_fn.next_functions[0][0]) # Linear(fc3)
print(loss.grad_fn.next_functions[0][0].next_functions[0][0]) # ReLU

 

Backpropagation

 

net.zero_grad() # reset all gradients to zero

# conv1.bias.grad before backward 
print(net.conv1.bias.grad) # tensor([0., 0., 0., 0., 0., 0.])

loss.backward()

# conv1.bias.grad after backward
print(net.conv1.bias.grad) # tensor([ 0.0059, -0.0039, -0.0022, -0.0094, -0.0209,  0.0144])

 

Updating the weights
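Before introducing torch.optim, the tutorial implements the rule weight = weight - learning_rate * gradient by hand:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)  # in-place: weight -= lr * gradient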

 

import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

optimizer.zero_grad() # reset gradients to zero
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step() # apply the update

 

 

https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html#sphx-glr-beginner-blitz-cifar10-tutorial-py

 

Training a Classifier — PyTorch Tutorials 1.4.0 documentation


 

Classifying CIFAR-10

Input data : 3*32*32

→ 3 channels (color), image size 32*32 pixels

 

1) load and normalize the CIFAR10 training and test data with torchvision

2) build a CNN

3) define a loss function

4) train on the training data

5) test on the test data

 

 

Loading the CIFAR10 data

 

import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

 

* transforms.Normalize()

Normalize does the following for each channel:

image = (image - mean) / std

The mean and std are both passed as 0.5 here, which normalizes the image to the range [-1, 1].

For example, the minimum value 0 is converted to (0-0.5)/0.5 = -1, and the maximum value 1 is converted to (1-0.5)/0.5 = 1.

(Source: https://discuss.pytorch.org/t/understanding-transform-normalize/21730)
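To display some of these images, the tutorial undoes this normalization with img / 2 + 0.5; a sketch of that helper (assumes matplotlib is installed, and continues with the trainloader above):

import matplotlib.pyplot as plt
import numpy as np

def imshow(img):
    img = img / 2 + 0.5                         # unnormalize from [-1, 1] back to [0, 1]
    npimg = img.numpy()
    plt.imshow(np.transpose(npimg, (1, 2, 0)))  # CHW -> HWC for matplotlib
    plt.show()

# usage: show one batch of training images
dataiter = iter(trainloader)
images, labels = next(dataiter)
imshow(torchvision.utils.make_grid(images))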

 

 

Building the CNN

 

import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


net = Net()

 

Defining the loss function and optimizer

 

import torch.optim as optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

 

Training

 

for epoch in range(2):
    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.item()
        if i % 2000 == 1999: # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' % (epoch+1, i+1, running_loss/2000))
            running_loss = 0.0

print('Finished Training')
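Not covered in the notes above, but a standard follow-up: persist the trained weights with torch.save (the file name here is just an example):

PATH = './cifar_net.pth'            # example file name
torch.save(net.state_dict(), PATH)  # save the learned parameters
# later: net.load_state_dict(torch.load(PATH))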

 

Testing

 

correct = 0
total = 0

with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)

        # pick the maximum along each row
        # _ receives the maximum value itself (a raw score, not a probability),
        # predicted receives the index of that maximum
        _, predicted = torch.max(outputs.data, 1)

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images : %d %%' % (100*correct/total))
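The tutorial finishes by breaking the accuracy down per class; a sketch of that loop (it relies on the batch size of 4 set in the loaders above):

class_correct = [0.0] * 10
class_total = [0.0] * 10

with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs, 1)
        c = (predicted == labels).squeeze()
        for i in range(4):                     # 4 = batch size of testloader
            label = labels[i]
            class_correct[label] += c[i].item()
            class_total[label] += 1

for i in range(10):
    print('Accuracy of %5s : %2d %%' % (classes[i], 100 * class_correct[i] / class_total[i]))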