A Look Inside the torch Framework

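Most functions in the torch framework are wrapped, which makes it hard for beginners to see what form the underlying data actually takes. So, after some practice with torch, this article takes a closer look at the wrapped functions the framework provides.

We print the following: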

print(torch.nn)

print(torch.optim)

print(torch.cuda)

Output:

<module 'torch.nn' from 'D:\\Anaconda\\envs\\pytorch\\lib\\site-packages\\torch\\nn\\__init__.py'>
<module 'torch.optim' from 'D:\\Anaconda\\envs\\pytorch\\lib\\site-packages\\torch\\optim\\__init__.py'>
<module 'torch.cuda' from 'D:\\Anaconda\\envs\\pytorch\\lib\\site-packages\\torch\\cuda\\__init__.py'>

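nn, optim, and cuda are all subpackages (directories) under the torch package, and their functions and classes live in the files inside those directories.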
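The mapping from a module or class to a file on disk can be checked directly. The sketch below is only an illustration and assumes a standard PyTorch installation where the Python sources are available. It uses the standard-library inspect module to locate the file that defines StepLR.

```python
import inspect
import os

import torch
from torch.optim.lr_scheduler import StepLR

# A package is a directory; its __file__ points at that directory's __init__.py.
nn_init_file = torch.nn.__file__
nn_dir = os.path.dirname(nn_init_file)

# A class is defined in one of the files under its package directory.
steplr_file = inspect.getsourcefile(StepLR)

print(nn_init_file)
print(nn_dir)
print(steplr_file)
```

Opening the printed file in an editor is often the quickest way to see what a wrapped function really does.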

1. Learning rate decay:

    # learning rate decay
    scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5, last_epoch=-1)

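Parameters:

step_size: the number of epochs between learning rate adjustments. With this scheduler, scheduler.step() is called inside the epoch loop.

gamma: after every step_size epochs, the learning rate becomes lr * gamma.

last_epoch: the index of the last epoch already trained, used when resuming. If training was interrupted after many epochs and is then continued, set it to the epoch stored with the loaded model so that the schedule picks up from there. The default, -1, means training starts from scratch and the learning rate starts at initial_lr.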
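The effect of step_size and gamma can be written as a one-line formula. The helper below, step_lr, is a hypothetical name introduced only for this illustration and is not part of torch. It is a plain-Python sketch of the step schedule, assuming 0-based epoch indices and training from scratch.

```python
def step_lr(base_lr, epoch, step_size, gamma):
    """Learning rate of a step schedule at a 0-based epoch index."""
    # The rate is multiplied by gamma once for every completed block of step_size epochs.
    return base_lr * gamma ** (epoch // step_size)


# Same settings as above: initial lr 0.01, halved every 20 epochs.
schedule = {epoch: step_lr(0.01, epoch, step_size=20, gamma=0.5)
            for epoch in (0, 19, 20, 39, 40)}
print(schedule)
```

With these settings the rate stays at 0.01 for epochs 0 to 19, drops to 0.005 at epoch 20, and to 0.0025 at epoch 40.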

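Commonly used methods of the scheduler object:

scheduler.step(): updates the learning rate. Call it once per epoch, after the optimizer update (optimizer.step()).

scheduler.state_dict(): the scheduler's state, returned as a dictionary.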

    print(scheduler.state_dict())
    # {'step_size': 20, 'gamma': 0.5, 'base_lrs': [0.01], 'last_epoch': 0, '_step_count': 1, 'verbose': False,
    #  '_get_lr_called_within_step': False, '_last_lr': [0.01]}

To print the learning rate at each epoch:

Use the optimizer's state_dict() method: optimizer.state_dict()['param_groups'][0]['lr']
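Putting the pieces together, a minimal sketch of an epoch loop that records the learning rate might look like this. The nn.Linear model is only a placeholder to give the optimizer some parameters, and step_size is shortened to 2 so that the decay is visible within a few epochs; both are assumptions made for the illustration.

```python
import torch
from torch import nn, optim

model = nn.Linear(2, 1)  # placeholder model, only here to provide parameters
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=2, gamma=0.5)

lrs = []
for epoch in range(6):
    # ... the batches of one epoch would be processed here ...
    optimizer.step()  # optimizer update first
    lrs.append(optimizer.state_dict()['param_groups'][0]['lr'])
    print(epoch, lrs[-1])
    scheduler.step()  # then the scheduler, once per epoch
```

The order matters: calling scheduler.step() after the optimizer update keeps the first epoch at the initial learning rate.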

Contents of the optimizer's state_dict():

    for k, v in optimizer.state_dict().items():
        print(k)
    # state
    # param_groups

'param_groups': [{'lr': 0.01, 'momentum': 0.9, 'dampening': 0, 'weight_decay': 0.0002, 'nesterov': False, 'initial_lr': 0.01, 'params': [0, 1,

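2. DataLoader:

The data loader. During training, each iteration fetches one batch of batch_size samples from the DataLoader, and one epoch is a full pass over all of its batches.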

    for index, (input, target) in enumerate(train_loader, 0):
        model.train()
        input = Variable(input).cuda()
        target = Variable(target).cuda()
        output = model(input)
        loss = criterion(output, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
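To make the relationship between batches and epochs concrete, here is a small sketch that runs on the CPU. The dataset is made-up toy data (10 samples, batch_size=4), chosen only so that the last batch is visibly smaller than the others.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 10 toy samples with 3 features each, plus one integer label per sample
inputs = torch.arange(30, dtype=torch.float32).reshape(10, 3)
targets = torch.arange(10)
train_loader = DataLoader(TensorDataset(inputs, targets), batch_size=4, shuffle=False)

batch_sizes = []
for epoch in range(2):  # each epoch is one full pass over the loader
    for index, (x, y) in enumerate(train_loader):  # each iteration yields one batch
        batch_sizes.append(x.shape[0])

print(len(train_loader), batch_sizes)
```

Here len(train_loader) is the number of batches per epoch, not the number of samples. The final batch of each epoch holds only the 2 leftover samples because drop_last defaults to False.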

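model.train(): puts the model in training mode. Batch normalization uses the statistics of the current batch, and dropout is active.

model.eval(): puts the model in evaluation mode. Batch normalization is still applied, using its stored running statistics, but dropout is disabled.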
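The difference between the two modes is easy to observe on a single dropout layer. The sketch below covers dropout only, with p=0.5 and an all-ones input chosen for illustration. Batch normalization switches modes in the same way but is not shown.

```python
import torch
from torch import nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()         # training mode: elements are zeroed at random
train_out = drop(x)  # survivors are scaled by 1 / (1 - p) = 2

drop.eval()          # evaluation mode: dropout does nothing
eval_out = drop(x)

print(train_out.unique(), eval_out.unique())
```

Calling model.train() or model.eval() on a whole network sets this flag on every submodule at once.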

.cuda(): calling .cuda() on the model and on the corresponding data copies (moves) them from host memory to GPU memory, so that the computation runs on the GPU.

    model = Model.get_net()
    if torch.cuda.is_available():
        model = model.cuda()

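Moving data:

Two data types are commonly used: Tensor and Variable. In practice they are the same thing, because a Variable is really just a container around a Tensor. Since PyTorch 0.4 the two have been merged, and Variable(tensor) simply returns a Tensor.

First, moving a Tensor to GPU memory: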

import torch
a = torch.zeros(2)  # torch.FloatTensor(2) would leave the values uninitialized
print(a)
b = a.cuda()
print(b)
c = b.cpu()
print(c)


#tensor([0., 0.])
#tensor([0., 0.], device='cuda:0')
#tensor([0., 0.])

To copy data from GPU memory back to host memory, call .cpu() on the CUDA tensor.
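The same round trip can be written so that it also runs on a machine without a GPU. This is a sketch using torch.device and Tensor.to rather than .cuda(); the fallback to the CPU is an assumption added here for portability.

```python
import torch

# Pick the GPU when one is available, otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.zeros(2)  # created in host memory
b = a.to(device)    # moved to the chosen device (a no-op on a CPU-only machine)
c = b.cpu()         # copied back to host memory

print(a.device, b.device, c.device)
```

Writing .to(device) once and reusing the device variable avoids scattering is_available() checks through the code.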

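Second, moving a Variable to GPU memory:

Variable is the usual container for data, mainly because a Variable can take part in backpropagation for automatic differentiation.

Likewise, moving a Variable to GPU memory only requires calling .cuda().

Calling .cuda() directly on a Variable gives the same result as calling .cuda() on the Tensor first and then wrapping it in a Variable.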

import torch
from torch.autograd import Variable
a = torch.zeros(2)  # torch.FloatTensor(2) would leave the values uninitialized
b = Variable(a).cuda()
print(b)
c = a.cuda()
d = Variable(c)
print(d)

#tensor([0., 0.], device='cuda:0')
#tensor([0., 0.], device='cuda:0')

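By default .cuda() uses the current CUDA device, which is GPU 0 (the first card) unless it has been changed. To store data on another card, use .cuda(device_index) to place it on the specified card.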

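Note:

Variables stored in different locations, for example one in host memory and one in GPU memory, generally cannot be used together in a computation. Move them to the same device first.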
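One way to respect this rule is to move an operand explicitly before computing. In the sketch below, add_on_same_device is a hypothetical helper written for this illustration, not a torch function. On a machine with a GPU, adding x and y directly would typically raise a RuntimeError about tensors being on different devices. On a CPU-only machine both tensors already share a device and the move does nothing.

```python
import torch


def add_on_same_device(x, y):
    """Hypothetical helper: move y to x's device, then add."""
    return x + y.to(x.device)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
x = torch.ones(2, device=device)  # on the GPU when one is available
y = torch.ones(2)                 # always in host memory

z = add_on_same_device(x, y)
print(z.device)
```

The result lives on the same device as x, so it may need a .cpu() call before being used with host-side data.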

This article will continue to be updated and improved.