
[DL Wizard] Recurrent Neural Network with PyTorch: Translation and Summary

codlingual 2020. 2. 8. 00:30

https://www.deeplearningwizard.com/deep_learning/practical_pytorch/pytorch_recurrent_neuralnetwork/

 


 

Model A : 1 Hidden Layer (ReLU)

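- Unrolled over 28 time steps

- 1 hidden layer

- Activation function: ReLU

- Input data: MNIST images of shape [1*28*28]

cf. RNN input = (1,28) per time step / CNN input = (1,28,28) / FNN input = (1,28*28)

* The RNN reads one 28-pixel row per time step, so 28 inputs per step * 28 steps covers the full 28*28 image

- Batch size: 100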
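The three input layouts compared above can be sketched with a stand-in batch. This is only an illustration: `batch` is random data in the shape a MNIST loader with batch size 100 would yield, and the names `cnn_input`, `fnn_input` and `rnn_input` are made up for this example.

```python
import torch

# Stand-in for one MNIST batch: 100 grayscale images of 28x28 pixels.
batch = torch.rand(100, 1, 28, 28)

cnn_input = batch                    # (100, 1, 28, 28): channel, height, width
fnn_input = batch.view(-1, 28 * 28)  # (100, 784): one flat vector per image
rnn_input = batch.view(-1, 28, 28)   # (100, 28, 28): (batch, seq_dim, input_dim)

# For the RNN, row t of an image is the input vector at time step t.
row_matches = torch.equal(rnn_input[0, 3], batch[0, 0, 3])

print(cnn_input.shape, fnn_input.shape, rnn_input.shape, row_matches)
```

The same pixels are used in all three cases; only the way they are grouped changes.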

 

 

 

Build Model

 

import torch
import torch.nn as nn


class RNNModel(nn.Module):
    def __init__(self, input_dim, hidden_dim, layer_dim, output_dim):
        super(RNNModel, self).__init__()
        # Hidden dimensions
        self.hidden_dim = hidden_dim
        # Number of hidden layers
        self.layer_dim = layer_dim

        # batch_first=True: the input tensor has shape (batch_dim, seq_dim, input_dim)
        # and the output tensor has shape (batch_dim, seq_dim, hidden_dim)
        # batch_dim = batch_size
        # nonlinearity can also be set to 'tanh'
        self.rnn = nn.RNN(input_dim, hidden_dim, layer_dim, batch_first=True, nonlinearity='relu')

        # Readout layer
        self.fc = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # Initialize the hidden state with zeros
        # Shape: (layer_dim, batch_size, hidden_dim)
        # e.g. h0 = torch.zeros(1, 100, 100)
        h0 = torch.zeros(self.layer_dim, x.size(0), self.hidden_dim).requires_grad_()

        # Detach the hidden state before passing it in; the source tutorial describes this
        # as part of truncated backpropagation through time (BPTT),
        # used to limit vanishing/exploding gradients
        out, hn = self.rnn(x, h0.detach())

        # out.size() -> 100, 28, 100 (28 = seq_dim, one output per time step)
        # out[:, -1, :] -> 100, 100 -> only the last time step (= the last hidden state) is needed
        # Feed the RNN's last output into the FC layer
        out = self.fc(out[:, -1, :])

        # out.size() -> 100, 10
        return out
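The tensor shapes inside `forward` can be checked with a standalone `nn.RNN` layer. This is a minimal sketch on random input, assuming the same sizes as Model A (input 28, hidden 100, 1 layer, batch 100). The variable names are for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(28, 100, 1, batch_first=True, nonlinearity='relu')
x = torch.rand(100, 28, 28)    # (batch, seq_dim, input_dim)
h0 = torch.zeros(1, 100, 100)  # (layer_dim, batch, hidden_dim)

out, hn = rnn(x, h0)
print(out.shape)  # torch.Size([100, 28, 100]): one hidden vector per time step
print(hn.shape)   # torch.Size([1, 100, 100]): final hidden state of each layer

# With a single unidirectional layer, the last time step of `out`
# is the same tensor content as the final hidden state.
last = out[:, -1, :]
same_as_hn = torch.allclose(last, hn[-1])
print(last.shape, same_as_hn)
```

This is why indexing `out[:, -1, :]` is enough to feed the readout layer: it has shape (100, 100), which `nn.Linear(100, 10)` maps to (100, 10).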

 

Instantiate Model Class

 

input_dim = 28
hidden_dim = 100
layer_dim = 1 # number of hidden layers; can be increased, e.g. to 2
output_dim = 10
seq_dim = 28 # number of time steps

model = RNNModel(input_dim, hidden_dim, layer_dim, output_dim)

 

Inspecting the Parameters

•input -> hidden (affine) : A1, B1

•hidden -> hidden at the next time step (affine) : A2, B2

•hidden -> output (affine) : A3, B3

 

• Total number of parameter tensors = 6

len(list(model.parameters())) # 6
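To see which tensor sits at which index, `named_parameters()` can be used instead of positional indexing. The sketch below builds the two layers of Model A separately, so the `rnn.` and `fc.` prefixes and the `shapes` dictionary are added here for illustration.

```python
import torch.nn as nn

rnn = nn.RNN(28, 100, 1, batch_first=True, nonlinearity='relu')
fc = nn.Linear(100, 10)

# Collect parameter names and shapes in registration order.
shapes = {}
for module_name, module in (("rnn", rnn), ("fc", fc)):
    for name, p in module.named_parameters():
        shapes[f"{module_name}.{name}"] = tuple(p.shape)

for name, shape in shapes.items():
    print(name, shape)
# rnn.weight_ih_l0 (100, 28)
# rnn.weight_hh_l0 (100, 100)
# rnn.bias_ih_l0 (100,)
# rnn.bias_hh_l0 (100,)
# fc.weight (10, 100)
# fc.bias (10,)
```

The two weight matrices come before the two biases, which is why A1, A2, B1, B2 appear at indices 0, 1, 2, 3.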

 

•input -> hidden (affine) 

list(model.parameters())[0].size() # A1 [100,28]

list(model.parameters())[2].size() # B1 [100]

→ A1 [100,28] * input(x) [28,1] + B1 [100,1] = [100,1] 

 

•hidden -> hidden at the next time step (affine)

list(model.parameters())[1].size() # A2 [100, 100]

list(model.parameters())[3].size() # B2 [100]

→ A2 [100,100] * hidden output of the previous time step [100,1] + B2 [100,1] = [100,1]

 

•hidden -> output (affine)

list(model.parameters())[4].size() # A3 [10,100]

list(model.parameters())[5].size() # B3 [10]

→ A3 [10,100] * hidden output of the last time step [100,1] + B3 [10,1] = [10,1]
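The affine steps above can be replayed by hand and compared with what `nn.RNN` computes. This is a minimal sketch on one random "image". It maps A1, A2, B1, B2 onto the layer's `weight_ih_l0`, `weight_hh_l0`, `bias_ih_l0`, `bias_hh_l0` attributes and assumes a zero initial hidden state.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

rnn = nn.RNN(28, 100, 1, batch_first=True, nonlinearity='relu')
A1, A2 = rnn.weight_ih_l0, rnn.weight_hh_l0  # [100,28], [100,100]
B1, B2 = rnn.bias_ih_l0, rnn.bias_hh_l0      # [100], [100]

x = torch.rand(1, 28, 28)  # one sequence: 28 steps of 28 inputs
h = torch.zeros(100)       # initial hidden state

with torch.no_grad():
    # h_t = ReLU(A1 x_t + B1 + A2 h_{t-1} + B2), repeated for 28 steps
    for t in range(28):
        h = torch.relu(A1 @ x[0, t] + B1 + A2 @ h + B2)
    out, hn = rnn(x)       # hidden state defaults to zeros

# Compare the hand-computed state with the layer's final hidden state.
matches = torch.allclose(h, hn[0, 0], atol=1e-4)
print(matches)
```

The readout layer then applies A3 and B3 to this final hidden vector to produce the 10 class scores.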

 

 

Instantiate Loss Class

- RNN : Cross Entropy Loss

- CNN : Cross Entropy Loss

- FNN : Cross Entropy Loss

- Logistic Regression : Cross Entropy Loss

- Linear Regression : MSE

 

criterion = nn.CrossEntropyLoss()
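`nn.CrossEntropyLoss` takes the model's raw [100,10] outputs (no softmax in the model) together with integer class labels. The sketch below, on random stand-in data, shows that it gives the same value as log-softmax followed by negative log-likelihood. The names `logits`, `labels` and `manual` are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

logits = torch.randn(100, 10)          # stand-in for the model outputs
labels = torch.randint(0, 10, (100,))  # integer class indices 0..9

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, labels)

# The same quantity computed in two explicit steps.
manual = F.nll_loss(F.log_softmax(logits, dim=1), labels)

same = torch.allclose(loss, manual)
print(loss.item(), manual.item(), same)
```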

 

Instantiate Optimizer Class

 

learning_rate = 0.01
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
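With plain SGD (no momentum or weight decay, as configured above), each step moves a parameter by `-learning_rate * gradient`. The following sketch checks that on a toy parameter. The tensor `w` and the quadratic loss are made up for this example.

```python
import torch

torch.manual_seed(0)

learning_rate = 0.01
w = torch.randn(3, requires_grad=True)  # toy parameter
optimizer = torch.optim.SGD([w], lr=learning_rate)

loss = (w ** 2).sum()
loss.backward()

# What the update rule predicts: w - lr * grad
expected = (w - learning_rate * w.grad).detach().clone()

optimizer.step()
updated = torch.allclose(w.detach(), expected)
print(updated)
```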

 

Train/Test Model

 

- Training

images = images.view(-1, seq_dim, input_dim).requires_grad_() # [100,28,28]

- Testing (no gradients needed)

images = images.view(-1, seq_dim, input_dim)

 

seq_dim = 28

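# num_epochs, train_loader and test_loader are assumed to be defined as in the source tutorial (not shown in this post)
iteration = 0
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        model.train()

        # Load the batch as (batch_dim, seq_dim, input_dim), matching batch_first=True
        images = images.view(-1, seq_dim, input_dim).requires_grad_()

        # Clear gradients from the previous step
        optimizer.zero_grad()

        # Forward pass: outputs has shape (batch_size, 10)
        outputs = model(images)

        # Cross-entropy loss between the outputs and the labels
        loss = criterion(outputs, labels)

        # Backward pass: compute gradients
        loss.backward()

        # Update parameters
        optimizer.step()

        iteration += 1

        if iteration % 500 == 0:
            model.eval()

            correct = 0
            total = 0

            # No gradients are needed for evaluation
            with torch.no_grad():
                for images, labels in test_loader:
                    images = images.view(-1, seq_dim, input_dim)

                    outputs = model(images)

                    # Predicted class = index of the largest output
                    _, predicted = torch.max(outputs, 1)

                    total += labels.size(0)
                    correct += (predicted == labels).sum().item()

            accuracy = 100 * correct / total

            print('Iteration: {}. Loss: {}. Accuracy: {}'.format(iteration, loss.item(), accuracy))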
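The accuracy computation in the evaluation loop can be tried in isolation. This is a small sketch with hand-written scores for four samples and three classes. The helper name `accuracy` and the toy numbers are hypothetical and not part of the tutorial.

```python
import torch

def accuracy(outputs, labels):
    """Percentage of rows whose largest score is at the labelled index."""
    predicted = outputs.argmax(dim=1)
    return 100.0 * (predicted == labels).sum().item() / labels.size(0)

outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3],
                        [0.0, 0.1, 0.9],
                        [0.7, 0.6, 0.1]])
labels = torch.tensor([1, 0, 2, 1])

result = accuracy(outputs, labels)
print(result)  # 75.0 (the last row predicts class 0, but the label is 1)
```

`argmax(dim=1)` returns the same indices as the second value of `torch.max(outputs, 1)` used in the loop.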
            