gradio/guides/building_a_pictionary_app.md
aliabid94 70ebf698fa
Live website changes (#1578)
2022-07-06 16:22:10 -07:00


Building a Pictionary App

Related spaces: https://huggingface.co/spaces/nateraw/quickdraw
Tags: SKETCHPAD, LABELS, LIVE
Docs: image, label

Introduction

How well can an algorithm guess what you're drawing? A few years ago, Google released the Quick Draw dataset, which contains human-made drawings of a variety of everyday objects. Researchers have used this dataset to train models to guess Pictionary-style drawings.

Such models are a natural fit for Gradio's sketchpad input, so in this tutorial we will build a Pictionary web application using Gradio. We will build the whole web application in Python, and it will look like this (try drawing something!):

Let's get started!

Prerequisites

Make sure you have the gradio Python package already installed. To use the pretrained sketchpad model, also install torch.
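
As a quick sanity check before continuing, the sketch below reports which of the two packages cannot be found in the current environment. The helper name `missing_packages` is hypothetical and only for illustration; it relies on the standard library's `importlib.util.find_spec`.

```python
import importlib.util

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# The two packages this tutorial relies on.
missing = missing_packages(["gradio", "torch"])
if missing:
    print("Install these first:", ", ".join(missing))
else:
    print("All prerequisites found.")
```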

Step 1 — Setting up the Sketch Recognition Model

First, you will need a sketch recognition model. Since many researchers have already trained their own models on the Quick Draw dataset, we will use a pretrained model in this tutorial. Our model is a lightweight 1.5 MB model trained by Nate Raw, which you can download here.

If you are interested, here is the code that was used to train the model. We will simply load the pretrained model in PyTorch, as follows:

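from pathlib import Path

import torch
from torch import nn

# Class names, one per line; the final layer needs one output per class.
LABELS = Path('class_names.txt').read_text().splitlines()

# Three conv/pool stages followed by a two-layer classifier head.
model = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(32, 64, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(64, 128, 3, padding='same'),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(1152, 256),
    nn.ReLU(),
    nn.Linear(256, len(LABELS)),
)
# Load the pretrained weights on the CPU and switch to inference mode.
state_dict = torch.load('pytorch_model.bin', map_location='cpu')
model.load_state_dict(state_dict, strict=False)
model.eval()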
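
The 1152 input features of the first linear layer can be checked with a little arithmetic. The sketch below assumes a 28x28 single-channel input; the guide does not state the input size, so treat that number as an assumption, though it is consistent with the layer sizes above.

```python
def pooled_size(size: int, kernel: int = 2) -> int:
    """Side length after MaxPool2d(kernel) with default stride and no padding."""
    return size // kernel

side = 28  # assumed input height and width (not stated in the guide)
for _ in range(3):
    # Conv2d(..., 3, padding='same') keeps the side length unchanged,
    # so only the pooling layer shrinks the feature map.
    side = pooled_size(side)

channels = 128  # output channels of the last Conv2d
flattened = channels * side * side
print(side, flattened)  # 3 1152
```

Under that assumption the side length goes 28, 14, 7, 3, and 128 channels of 3x3 maps flatten to 1152 values, matching `nn.Linear(1152, 256)`.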

Step 2 — Defining a predict function

Next, you will need to define a function that takes in the user input, which in this case is a sketched image, and returns the prediction. The prediction should be returned as a dictionary whose keys are class names and whose values are confidence probabilities. We will load the class names from this text file.

In the case of our pretrained model, it will look like this:

from pathlib import Path

LABELS = Path('class_names.txt').read_text().splitlines()

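def predict(img):
    # Scale pixels to [0, 1] and add batch and channel dimensions: (1, 1, H, W).
    x = torch.tensor(img, dtype=torch.float32).unsqueeze(0).unsqueeze(0) / 255.
    # Inference only, so skip gradient tracking.
    with torch.no_grad():
        out = model(x)
    # Turn the raw scores for the single image into probabilities.
    probabilities = torch.nn.functional.softmax(out[0], dim=0)
    # Keep the five most likely classes.
    values, indices = torch.topk(probabilities, 5)
    confidences = {LABELS[i]: v.item() for i, v in zip(indices, values)}
    return confidences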

Let's break this down. The function takes one parameter:

  • img: the input image as a numpy array

Then, the function converts the image to a PyTorch tensor, passes it through the model, and returns:

  • confidences: the top five predictions, as a dictionary whose keys are class labels and whose values are confidence probabilities
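
To make the softmax and top-k steps concrete without needing the model, here is a minimal pure-Python sketch of the same computation. The function name `top_k_confidences`, the labels, and the logits are hypothetical values chosen for illustration; they do not come from the pretrained model.

```python
import math

def top_k_confidences(logits, labels, k=5):
    """Softmax the logits, then keep the k most probable labels."""
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Sort label indices by descending probability and keep the first k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return {labels[i]: probs[i] for i in order}

# Hypothetical labels and raw model scores, for illustration only.
example_labels = ["snake", "laptop", "cat", "tree"]
example_logits = [2.0, 1.0, 0.5, -1.0]
result = top_k_confidences(example_logits, example_labels, k=2)
print(result)
```

The returned dictionary has the same shape as the one `predict` builds: labels as keys, probabilities as values, most likely class first.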

Step 3 — Creating a Gradio Interface

Now that we have our predictive function set up, we can create a Gradio Interface around it.

In this case, the input component is a sketchpad. To create a sketchpad input, we can use the convenient string shortcut "sketchpad", which creates a canvas for the user to draw on and handles the preprocessing that converts the drawing to a numpy array.

The output component will be a "label", which displays the top labels in a nice form.

Finally, we'll add one more parameter, setting live=True, which allows our interface to run in real time, adjusting its predictions every time a user draws on the sketchpad. The code for Gradio looks like this:

import gradio as gr

gr.Interface(fn=predict, 
             inputs="sketchpad",
             outputs="label",
             live=True).launch()

This produces the following interface, which you can try right here in your browser (try drawing something, like a "snake" or a "laptop"):
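
If you want to check the interface wiring before the model files are in place, one option is to swap in a placeholder function that returns a fixed dictionary in the format the label output expects. This is only a sketch: `placeholder_predict` and `build_demo` are hypothetical names, and the scores are made up.

```python
def placeholder_predict(img):
    """Hypothetical stand-in for predict: ignores the drawing, returns fixed scores."""
    # Same output format as predict: class label -> confidence probability.
    return {"snake": 0.6, "laptop": 0.3, "cat": 0.1}

def build_demo():
    """Build the same interface around the placeholder (imports gradio lazily)."""
    import gradio as gr
    return gr.Interface(fn=placeholder_predict,
                        inputs="sketchpad",
                        outputs="label",
                        live=True)

# To try it locally: build_demo().launch()
```

Once the label component shows the three fixed entries, replace `placeholder_predict` with the real `predict` function.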


And you're done! That's all the code you need to build a Pictionary-style guessing app. Have fun and try to find some edge cases 🧐