# Building a Pictionary App
related_spaces: https://huggingface.co/spaces/nateraw/quickdraw
tags: SKETCHPAD, LABELS, LIVE
Docs: image, label
## Introduction
How well can an algorithm guess what you're drawing? A few years ago, Google released the Quick Draw dataset, which contains drawings made by humans of a variety of everyday objects. Researchers have used this dataset to train models to guess Pictionary-style drawings.
Such models are perfect to use with Gradio's sketchpad input, so in this tutorial we will build a Pictionary web application using Gradio. We will be able to build the whole web application in Python, and it will look like this (try drawing something!):
Let's get started!
## Prerequisites
Make sure you have the `gradio` Python package already installed. To use the pretrained sketchpad model, also install `torch`.
## Step 1 — Setting up the Sketch Recognition Model
First, you will need a sketch recognition model. Since many researchers have already trained their own models on the Quick Draw dataset, we will use a pretrained model in this tutorial. Our model is a lightweight 1.5 MB model trained by Nate Raw, which you can download here.
If you are interested, here is the code that was used to train the model. We will simply load the pretrained model in PyTorch, as follows:

```python
from pathlib import Path

import torch
from torch import nn

# Load the class names first: the size of the model's
# final layer depends on the number of classes.
LABELS = Path('class_names.txt').read_text().splitlines()

model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(1152, 256),
    nn.ReLU(),
    nn.Linear(256, len(LABELS)),
)
state_dict = torch.load('pytorch_model.bin', map_location='cpu')
model.load_state_dict(state_dict, strict=False)
model.eval()
```
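As an optional sanity check (our own addition, not part of the original code), you can confirm the expected input resolution: three 2×2 max-pools take a 28×28 Quick Draw bitmap down to 3×3, and 128 × 3 × 3 = 1152 matches the first `Linear` layer:

```python
# Optional sanity check: feed a dummy 28x28 grayscale bitmap through
# the network and confirm we get one score per class.
with torch.no_grad():
    out = model(torch.zeros(1, 1, 28, 28))
print(out.shape)  # expected: torch.Size([1, len(LABELS)])
```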
## Step 2 — Defining a `predict` function
Next, you will need to define a function that takes in the user input, which in this case is a sketched image, and returns the prediction. The prediction should be returned as a dictionary whose keys are class names and whose values are confidence probabilities. We already loaded the class names from this text file in Step 1.
In the case of our pretrained model, it will look like this:
```python
def predict(img):
    x = torch.tensor(img, dtype=torch.float32).unsqueeze(0).unsqueeze(0) / 255.
    with torch.no_grad():
        out = model(x)
    probabilities = torch.nn.functional.softmax(out[0], dim=0)
    values, indices = torch.topk(probabilities, 5)
    confidences = {LABELS[i]: v.item() for i, v in zip(indices, values)}
    return confidences
```
Let's break this down. The function takes one parameter:

- `img`: the input image as a `numpy` array

Then, the function converts the image to a PyTorch `tensor`, passes it through the model, and returns:

- `confidences`: the top five predictions, as a dictionary whose keys are class labels and whose values are confidence probabilities
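For a quick check that everything is wired up, you can call `predict` directly on a blank canvas (a hypothetical smoke test of our own, assuming the model's 28×28 grayscale input):

```python
import numpy as np

# Hypothetical smoke test: a blank 28x28 grayscale sketch should
# still return a dictionary of five class labels and probabilities.
blank = np.zeros((28, 28), dtype=np.uint8)
print(predict(blank))
```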
## Step 3 — Creating a Gradio Interface
Now that we have our predictive function set up, we can create a Gradio Interface around it.
In this case, the input component is a sketchpad. To create a sketchpad input, we can use the convenient string shortcut, `"sketchpad"`, which creates a canvas for a user to draw on and handles the preprocessing to convert that to a numpy array.

The output component will be a `"label"`, which displays the top labels in a nice form.

Finally, we'll add one more parameter, setting `live=True`, which allows our interface to run in real time, adjusting its predictions every time a user draws on the sketchpad. The code for Gradio looks like this:
```python
import gradio as gr

gr.Interface(fn=predict,
             inputs="sketchpad",
             outputs="label",
             live=True).launch()
```
This produces the following interface, which you can try right here in your browser (try drawing something, like a "snake" or a "laptop"):
And you're done! That's all the code you need to build a Pictionary-style guessing app. Have fun and try to find some edge cases 🧐
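One last tip: if you want finer control than the string shortcuts give you (for example, capping how many labels are displayed), you can pass component objects instead. Here is a sketch assuming Gradio 3.x component names (`gr.Sketchpad`, `gr.Label` with `num_top_classes`); check the docs for your installed version:

```python
import gradio as gr

# Equivalent Interface with explicit components
# (assumes Gradio 3.x APIs; names may differ in other versions).
gr.Interface(fn=predict,
             inputs=gr.Sketchpad(),
             outputs=gr.Label(num_top_classes=5),
             live=True).launch()
```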