from miniai.datasets import *
from miniai.conv import *

Hugging Face Features

In lesson 15 of Practical Deep Learning for Coders, we used Hugging Face Datasets to download the Fashion MNIST data and train our model. I ran into a problem here: I could not fit my model fast enough. Even using all my CPU cores, it was not as fast as Jeremy's machine, and the CPU on Google Colab was not much stronger. It was not a huge problem, but it was annoying, so I decided to find a solution.
In the lesson, we downloaded images and applied a transform function to convert them into tensors. With dsd.with_transform, the transform ran on every batch, and that took most of the time. But we don't have to apply the transform on every batch, so let's find a way to do it only once.
Initially, I just wanted to convert the images into tensors with map, but Hugging Face uses Apache Arrow, which does not support a tensor type. So I used Hugging Face Features and Array2D to fix this problem.
Here is the original approach from the course, which takes a long time.
Original approach
from datasets import load_dataset,load_dataset_builder
import torch
import torchvision.transforms.functional as TF
from torch import optim, nn,tensor
import torch.nn.functional as F
import fastcore.all as fc
import logging
logging.disable(logging.WARNING)
from tqdm import tqdm

First, we grab the data from Hugging Face.
x,y = 'image','label'
name = "fashion_mnist"
dsd = load_dataset(name)

Here is an inplace transform function. It is applied to every batch and converts the images into tensors with the right shape.
@inplace
def transformi(b): b[x] = [torch.flatten(TF.to_tensor(o)) for o in b[x]]

Since with_transform applies the transform function to every new batch, it is a good fit for data augmentation or anywhere else we want randomness, as in the sketch below.
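For example, a random augmentation belongs in with_transform, since it reruns on every access. Here is a hypothetical sketch (transformi_noisy is not from the course; it just adds a little Gaussian noise):

@inplace
def transformi_noisy(b): b[x] = [torch.flatten(TF.to_tensor(o)) + 0.01*torch.randn(28*28) for o in b[x]]

tds_noisy = dsd.with_transform(transformi_noisy)  # fresh noise every time a batch is drawn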
bs = 1024
tds = dsd.with_transform(transformi)

Now we make PyTorch DataLoaders. We can specify how many worker processes to use; we are using 4 here.
dls = DataLoaders.from_dd(tds, bs, num_workers=4)
dt = dls.train
xb,yb = next(iter(dt))
xb.shape,yb[:10]

(torch.Size([1024, 784]), tensor([2, 6, 7, 4, 9, 5, 3, 5, 6, 7]))
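Under the hood, from_dd roughly wraps each split in a plain PyTorch DataLoader with a collate function that turns the dataset's dicts into (x, y) tuples. A minimal sketch of the idea, not miniai's exact code:

from operator import itemgetter
from torch.utils.data import DataLoader, default_collate

def collate_dict(ds):
    get = itemgetter(*ds.features)                 # ('image', 'label'), in order
    def _f(b): return get(default_collate(b))      # dict of batched columns -> tuple
    return _f

train_dl = DataLoader(tds['train'], batch_size=bs, collate_fn=collate_dict(tds['train']), num_workers=4)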
This is the Learner class. It is not very flexible, but it works.
class Learner:
    def __init__(self, model, dls, loss_func, lr, opt_func=optim.SGD): fc.store_attr()

    def one_batch(self):
        self.xb,self.yb = to_device(self.batch)
        self.preds = self.model(self.xb)
        self.loss = self.loss_func(self.preds, self.yb)
        if self.model.training:
            self.loss.backward()
            self.opt.step()
            self.opt.zero_grad()
        with torch.no_grad(): self.calc_stats()

    def calc_stats(self):
        # Accumulate per-batch accuracy counts and summed losses so we can
        # print running averages at the end of each epoch.
        acc = (self.preds.argmax(dim=1)==self.yb).float().sum()
        self.accs.append(acc)
        n = len(self.xb)
        self.losses.append(self.loss*n)
        self.ns.append(n)

    def one_epoch(self, train):
        self.model.training = train
        dl = self.dls.train if train else self.dls.valid
        for self.num,self.batch in enumerate(dl): self.one_batch()
        n = sum(self.ns)
        print(self.epoch, self.model.training, sum(self.losses).item()/n, sum(self.accs).item()/n)

    def fit(self, n_epochs):
        self.accs,self.losses,self.ns = [],[],[]
        self.model.to(def_device)
        self.opt = self.opt_func(self.model.parameters(), self.lr)
        self.n_epochs = n_epochs
        for self.epoch in range(n_epochs):
            self.one_epoch(True)
            with torch.no_grad(): self.one_epoch(False)

m,nh = 28*28,50
model = nn.Sequential(nn.Linear(m,nh), nn.ReLU(), nn.Linear(nh,10))

We fit, but this is not very fast.
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
learn.fit(1)

# Using only 1 worker
0 True 1.1959598958333333 0.6107833333333333
0 False 1.1534678571428572 0.6217571428571429
CPU times: user 5.41 s, sys: 461 ms, total: 5.87 s
Wall time: 7.88 s
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
learn.fit(1)

# Using 4 workers
0 True 0.7164356770833333 0.7443166666666666
0 False 0.7154278459821428 0.7437571428571429
CPU times: user 4.6 s, sys: 434 ms, total: 5.03 s
Wall time: 7.79 s
Okay. We used 4 worker processes to train the model here, but it is still not very fast. Let's make it faster!
Faster fit
By using Hugging Face Features, we can turn the images into tensors once, when we download the data. First, we use load_dataset_builder to look at the metadata, such as the features, splits, and description of the data, without actually downloading it yet.
builder = load_dataset_builder(name)
builder.info.features

{'image': Image(decode=True, id=None),
'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
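The builder exposes the rest of the metadata the same way; for example (a quick sketch, outputs elided):

builder.info.description   # a short text description of Fashion-MNIST
builder.info.splits        # names and sizes of the train/test splits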
dsd_features = dsd['train'].features.copy()
dsd_features

{'image': Image(decode=True, id=None),
'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
from datasets import Features, Array2D

We use Array2D to turn the images into 2D arrays with a given shape and dtype. It is a bit weird to use Array2D with shape=[1, 28*28] instead of something like Array or Array1D with shape=[28*28], but Hugging Face does not have a 1D array feature; we can just use map to squeeze the extra dimension away afterwards. This won't be a problem with colored images, which are genuinely multi-dimensional.
dsd_features['image'] = Array2D(shape=[1, 28*28], dtype='float32')
dsd_features

{'image': Array2D(shape=(1, 784), dtype='float32', id=None),
'label': ClassLabel(names=['T - shirt / top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot'], id=None)}
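For a colored dataset, the extra dimensions are exactly what we want. As a hedged sketch (assuming CIFAR-style 3x32x32 images; not run here), the feature could look like this:

from datasets import Array3D
color_features = dsd_features.copy()
color_features['image'] = Array3D(shape=(3, 32, 32), dtype='float32')  # channels, height, width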
Now we load the dataset using those features. But indexing into it gives a list! Why is it not a tensor? We have to set the format to torch to get tensors back.
dsd = load_dataset(name, features=dsd_features)
type(dsd['train'][0][x])

list
dsd.set_format(type="torch")
type(dsd['train'][0][x])

torch.Tensor
dsd['train'][0][x].shape

torch.Size([1, 784])
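By the way, set_format modifies dsd in place. If you prefer to keep the original untouched, with_format returns a formatted copy instead (a quick sketch):

dsd_torch = dsd.with_format("torch")   # same data, returned as a new formatted DatasetDict
type(dsd_torch['train'][0][x])         # torch.Tensor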
Now we just need to squeeze each tensor to get rid of the useless 1 in the shape (and scale the pixel values into [0, 1] while we are at it).
@inplace
def sq(b): b[x] = [o.squeeze().div(255) for o in b[x]]

Here, we use map to squeeze them; with batched=True, it is faster.
tds = dsd.map(sq, batched=True)
tds['train'][0][x].shape

torch.Size([784])
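If the one-time pass is still slow on a bigger dataset, map also takes a num_proc argument to spread the work over several processes (a sketch; the right number depends on your machine):

tds = dsd.map(sq, batched=True, num_proc=4)   # parallel one-time preprocessing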
So, why didn’t we just use torch.tensor in the beginning and used Features and Array2D? Because Hugging Face converts tensors back to images. Hugging Face uses Apache Arrow, and it does not support tensors are not supported. So data have to be either list or image, and we do not want image.
Now the data is in the right shape. The difference is that we no longer convert from image to tensor on every batch: with map, there is no computation on the fly, which is exactly what we want here.
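One more nicety: datasets caches the result of map on disk, keyed by a fingerprint of the function and its arguments, so re-running the exact same map call loads from the cache instead of recomputing:

tds_again = dsd.map(sq, batched=True)   # same call as before: served from the Arrow cache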
dls = DataLoaders.from_dd(tds, bs, num_workers=0)
dt = dls.train
xb,yb = next(iter(dt))
xb.shape,yb[:10]

(torch.Size([1024, 784]), tensor([2, 0, 0, 0, 0, 7, 0, 5, 5, 2]))
Now it is very fast to train, even with num_workers=0 (no extra workers at all).
learn = Learner(model, dls, F.cross_entropy, lr=0.2)
learn.fit(1)

0 True 0.6185346354166666 0.7802833333333333
0 False 0.6170732700892857 0.7807571428571428
CPU times: user 5.4 s, sys: 225 ms, total: 5.63 s
Wall time: 2.82 s
Conclusion
We used Features and Array2D to convert the images into tensors once, for faster training. It was awkward to use Array2D when we really wanted a 1D array, but it was not a problem.