```python
from miniai.datasets import *
from miniai.conv import *
```
# Hugging Face Features
In lesson 15 of Practical Deep Learning for Coders, we used Hugging Face Datasets to download the Fashion MNIST data and train a model. I ran into a problem here: I could not fit my model fast enough. Even using all my CPU cores, it was not as fast as Jeremy's machine, and the Google Colab CPU was not much better. It was not a huge problem, but it was annoying, so I decided to find a solution.
In the lesson, we downloaded images and applied a transform function to convert them into tensors. With `dsd.with_transform`, the transform runs on every batch, and that is where most of the time went. We don't have to apply the transform on every batch, so let's find a way to do it only once.
Initially, I just wanted to convert the images into tensors with `map`, but Hugging Face stores data in Apache Arrow, which does not support the tensor type. So I used Hugging Face `Features` and `Array2D` to fix this problem.
Here is the original approach from the course, which takes a long time.
## Original approach
```python
from datasets import load_dataset,load_dataset_builder
import torch
import torchvision.transforms.functional as TF
from torch import optim,nn,tensor
import torch.nn.functional as F
import fastcore.all as fc
import logging
logging.disable(logging.WARNING)
from tqdm import tqdm
```
First, we grab the data from Hugging Face.
```python
x,y = 'image','label'
name = "fashion_mnist"
dsd = load_dataset(name)
```
Here is an in-place transform function. It is applied to every batch and converts the images into tensors with the right shape.
```python
@inplace
def transformi(b): b[x] = [torch.flatten(TF.to_tensor(o)) for o in b[x]]
```
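(For reference, `inplace` comes from miniai. It is roughly a decorator that calls a batch-mutating function and then returns the batch, which is the shape `with_transform` and `map` expect. This is a sketch, not necessarily the exact course definition.)

```python
# Rough sketch of miniai's `inplace` (an assumption, not the exact source):
# turn a function that mutates a batch dict into one that also returns it.
def inplace(f):
    def _inner(b):
        f(b)      # mutate the batch dict in place
        return b  # return it, as with_transform/map expect
    return _inner
```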
Since `with_transform` applies the transform function to every new batch, it is a good fit for data augmentation or other places where we want randomness.
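For example, a random augmentation belongs in `with_transform`, because we want a fresh result every time a batch is drawn. Here is a hypothetical transform (`aug` is my name, not from the course) that adds a little noise on every access:

```python
@inplace
def aug(b):
    # hypothetical: convert to tensors and add Gaussian noise, so the same
    # example yields a slightly different tensor each time it is drawn
    b[x] = [torch.flatten(TF.to_tensor(o)) + 0.01*torch.randn(28*28) for o in b[x]]

aug_ds = dsd.with_transform(aug)
```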
```python
bs = 1024
tds = dsd.with_transform(transformi)
```
Now we make the PyTorch DataLoaders. We can say how many worker processes we want to use; we are using 4 here.
```python
dls = DataLoaders.from_dd(tds, bs, num_workers=4)
dt = dls.train
xb,yb = next(iter(dt))
xb.shape,yb[:10]
```

```
(torch.Size([1024, 784]), tensor([2, 6, 7, 4, 9, 5, 3, 5, 6, 7]))
```
This is the `Learner` class. It is not very flexible, but it works.
```python
class Learner:
    def __init__(self, model, dls, loss_func, lr, opt_func=optim.SGD): fc.store_attr()

    def one_batch(self):
        self.xb,self.yb = to_device(self.batch)
        self.preds = self.model(self.xb)
        self.loss = self.loss_func(self.preds, self.yb)
        if self.model.training:  # only backprop during training
            self.loss.backward()
            self.opt.step()
            self.opt.zero_grad()
        with torch.no_grad(): self.calc_stats()

    def calc_stats(self):
        # track accuracy and loss, weighted by batch size
        acc = (self.preds.argmax(dim=1)==self.yb).float().sum()
        self.accs.append(acc)
        n = len(self.xb)
        self.losses.append(self.loss*n)
        self.ns.append(n)

    def one_epoch(self, train):
        self.model.training = train
        dl = self.dls.train if train else self.dls.valid
        for self.num,self.batch in enumerate(dl): self.one_batch()
        n = sum(self.ns)
        print(self.epoch, self.model.training, sum(self.losses).item()/n, sum(self.accs).item()/n)

    def fit(self, n_epochs):
        self.accs,self.losses,self.ns = [],[],[]
        self.model.to(def_device)
        self.opt = self.opt_func(self.model.parameters(), self.lr)
        self.n_epochs = n_epochs
        for self.epoch in range(n_epochs):
            self.one_epoch(True)
            with torch.no_grad(): self.one_epoch(False)
```
```python
m,nh = 28*28,50
model = nn.Sequential(nn.Linear(m,nh), nn.ReLU(), nn.Linear(nh,10))
```
We fit, but this is not very fast.
```python
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
%time learn.fit(1)  # Using only 1
```

```
0 True 1.1959598958333333 0.6107833333333333
0 False 1.1534678571428572 0.6217571428571429
CPU times: user 5.41 s, sys: 461 ms, total: 5.87 s
Wall time: 7.88 s
```
```python
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
%time learn.fit(1)  # Using 4
```

```
0 True 0.7164356770833333 0.7443166666666666
0 False 0.7154278459821428 0.7437571428571429
CPU times: user 4.6 s, sys: 434 ms, total: 5.03 s
Wall time: 7.79 s
```
Okay. We used 4 workers to train the model here, but it is still not very fast. Let's make it faster!
## Faster fit
By using Hugging Face `Features`, we can turn the images into tensors when we download the data. First, we use `load_dataset_builder` to look at the metadata, such as the features, splits, and description of the data, without actually downloading anything yet.
```python
builder = load_dataset_builder(name)
builder.info.features
```

```
{'image': Image(decode=True, id=None),
 'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
```
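The same builder object also exposes the splits and a free-text description, still without downloading anything:

```python
builder.info.splits       # split names and sizes, e.g. train/test
builder.info.description  # free-text description of the dataset
```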
```python
dsd_features = dsd['train'].features.copy()
dsd_features
```

```
{'image': Image(decode=True, id=None),
 'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
```
```python
from datasets import Features, Array2D
```
We use `Array2D` to turn the images into 2D arrays with a certain shape and dtype. It is a bit awkward using `Array2D` with `shape=[1, 28*28]` instead of something like an `Array` or `Array1D` with `shape=[28*28]`, but Hugging Face does not have those, so we just use `map` to squeeze out the extra dimension afterwards. This won't be a problem with colored images, since they have a channel dimension anyway.
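For instance, a hypothetical 3-channel version (think CIFAR-10's 3×32×32 images) would put the channels in the leading dimension, and no squeeze would be needed; `datasets` also has an `Array3D` feature if we wanted to keep height and width separate:

```python
# Hypothetical: for 3x32x32 color images the leading dim is the channels
color_features = dsd_features.copy()
color_features['image'] = Array2D(shape=[3, 32*32], dtype='float32')
```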
```python
dsd_features['image'] = Array2D(shape=[1, 28*28], dtype='float32')
dsd_features
```

```
{'image': Array2D(shape=(1, 784), dtype='float32', id=None),
 'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
```
Now we load the dataset using those features. But the result is a `list`! Why is it not a tensor? We have to set the format to `torch` to get tensors back.
```python
dsd = load_dataset(name, features=dsd_features)
type(dsd['train'][0][x])
```

```
list
```
type="torch")
dsd.set_format(type(dsd['train'][0][x])
torch.Tensor
'train'][0][x].shape dsd[
torch.Size([1, 784])
Now we just need to squeeze each tensor to get rid of the useless leading 1 in the shape. While we are at it, we also divide by 255 so the pixel values land in [0, 1], matching what `TF.to_tensor` did in the original approach.
```python
@inplace
def sq(b): b[x] = [o.squeeze().div(255) for o in b[x]]
```
Here, we use `map` to squeeze them. With `batched=True`, the function receives batches of examples at once, which is faster.
```python
tds = dsd.map(sq, batched=True)
tds['train'][0][x].shape
```

```
torch.Size([784])
```
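Since this `map` runs only once, its speed matters much less than the per-batch transform did, but for heavier preprocessing it can also be parallelized across processes:

```python
# Optional: spread the one-off preprocessing over 4 worker processes
tds = dsd.map(sq, batched=True, num_proc=4)
```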
## `torch.tensor`?
So, why didn't we just use `torch.tensor` from the beginning instead of `Features` and `Array2D`? Because Hugging Face converts the tensors back to images. Hugging Face uses Apache Arrow, and Arrow does not support tensors, so the data has to be either a list or an image, and we do not want an image.
Now the data is in the right shape. The difference is that it no longer has to be converted from image to tensor on every batch: with `map`, there is no calculation on the fly, which is exactly what we want here.
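To see the difference concretely, here is a small sketch (my own, not from the course) that counts how often a transform runs: `with_transform` is lazy and re-runs on every access, while `map` ran `sq` exactly once per example up front.

```python
# Sketch: with_transform re-runs on every access; map ran once up front.
raw = load_dataset(name)
calls = dict(n=0)

@inplace
def counting(b):
    calls['n'] += 1
    b[x] = [torch.flatten(TF.to_tensor(o)) for o in b[x]]

lazy = raw.with_transform(counting)
_ = lazy['train'][0]; _ = lazy['train'][0]
calls['n']  # 2 -- the same item was transformed twice
```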
```python
dls = DataLoaders.from_dd(tds, bs, num_workers=0)
dt = dls.train
xb,yb = next(iter(dt))
xb.shape,yb[:10]
```

```
(torch.Size([1024, 784]), tensor([2, 0, 0, 0, 0, 7, 0, 5, 5, 2]))
```
Now it is very fast to train, even with no worker processes at all (`num_workers=0`).
```python
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
%time learn.fit(1)
```

```
0 True 0.6185346354166666 0.7802833333333333
0 False 0.6170732700892857 0.7807571428571428
CPU times: user 5.4 s, sys: 225 ms, total: 5.63 s
Wall time: 2.82 s
```
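The wall time for one epoch dropped from about 7.9s to 2.8s. To isolate the data-loading side, a quick hypothetical micro-benchmark like this would time one bare pass over the training batches:

```python
import time

# Hypothetical micro-benchmark: one pass over the training batches, no model
start = time.perf_counter()
for xb,yb in dls.train: pass
print(f"one pass: {time.perf_counter()-start:.2f}s")
```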
## Conclusion
We used `Features` and `Array2D` to convert the images into tensors once, up front, for faster training. It was awkward using `Array2D` where an `Array1D` would have been natural, but it was not a problem.