Computer Vision Mobile App — End-to-end AI pipeline demo using Onepanel

The following steps describe how to use the APIs consumed by the demo application. The code for the app is hosted here. The whole project is visualized in the block diagram below: App Flow Diagram


  1. Resume dataset-upload-api and run it to start the API.

The basic idea of file uploads is simple. It works like this:

1. A <form> tag is marked with enctype=multipart/form-data and an <input type=file> is placed in that form.

2. The application accesses the file from the files dictionary on the request object.

3. The save() method of the file is used to store it permanently somewhere on the filesystem.

import os
from flask import Flask, flash, request, redirect, url_for
from werkzeug.utils import secure_filename

UPLOAD_FOLDER = '/path/to/the/uploads'
ALLOWED_EXTENSIONS = {'txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'}

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER
The werkzeug.secure_filename() helper is explained a little later. UPLOAD_FOLDER is where the uploaded files will be stored and ALLOWED_EXTENSIONS is the set of allowed file extensions.

Next come the function that checks whether an extension is valid and the view that uploads the file and redirects the user to the URL for the uploaded file:

def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        # check if the post request has the file part
        if 'file' not in request.files:
            flash('No file part')
            return redirect(request.url)
        file = request.files['file']
        # if user does not select file, browser also
        # submit an empty part without filename
        if file.filename == '':
            flash('No selected file')
            return redirect(request.url)
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
  ['UPLOAD_FOLDER'], filename))
            return redirect(url_for('uploaded_file',
                                    filename=filename))
    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form method=post enctype=multipart/form-data>
      <input type=file name=file>
      <input type=submit value=Upload>
    </form>
    '''

Once the uploaded data is in the workspace, create a dataset, pull it into the CVAT workspace, and start annotating. Dump the annotations into a CVAT XML file. Given a CVAT XML file and a directory with the image dataset, this script reads the CVAT XML and writes the annotations in tfrecords format into a given directory, in addition to the label map required for the TensorFlow Object Detection API. Install the necessary packages (including TensorFlow):

sudo apt-get update
sudo apt-get install -y --no-install-recommends python3-pip python3-dev
pip3 install -r requirements.txt

2. Install the TensorFlow Object Detection API

If it's already installed, you can check your $PYTHONPATH and move on to the usage section. Here's a quick (unofficial) guide on how to do that. For more details, follow the official guide, INSTALL TENSORFLOW OBJECT DETECTION API.

# clone the models repository
git clone
# install some dependencies
pip3 install --user Cython
pip3 install --user contextlib2
pip3 install --user pillow
pip3 install --user lxml
pip3 install --user jupyter
pip3 install --user matplotlib
# clone and compile the cocoapi
git clone
cd cocoapi/PythonAPI
cp -r pycocotools <path_to_models_repo>/models/research/
# Protobuf Compilation
cd <path_to_models_repo>/models/research/
protoc object_detection/protos/*.proto --python_out=.
# setup the PYTHONPATH
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim

## Usage

Run the script.

$ python3 --cvat-xml </path/to/cvat/xml> --image-dir </path/to/images> \
  --output-dir </path/to/output/directory> --attribute <attribute>

Leave the --attribute argument empty if you want the CVAT labels to be used as the tfrecords labels; otherwise, specify an attribute name with --attribute <attribute>.

Please run python --help for more details.

Once the data is annotated and converted to tfrecords, use the following to train a model with it:

$ python /onepanel/code/models/research/object_detection/legacy/ \
--train_dir=/onepanel/code/Custom-Mask-RCNN-using-Tensorfow-Object-detection-API/CP \

Start the Inference API

Once we’ve retrained our model and exported it to disk, we can host it as a service. We’ll load the model from disk with a simple function that takes the graph definition directly from the file and uses it to generate a graph. TensorFlow does most of this for us. Resume dataset-upload-api and run it to start the API.

import numpy as np
import tensorflow as tf

from object_detection.utils import label_map_util

detection_graph = tf.Graph()
with detection_graph.as_default():
  od_graph_def = tf.GraphDef()
  with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
    serialized_graph =
    tf.import_graph_def(od_graph_def, name='')
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape(
      (im_height, im_width, 3)).astype(np.uint8)
Using Flask, much of the heavy lifting around configuring a server and handling requests is done for us. After we’ve created a Flask app object:
app = Flask(__name__)
Then, we can easily create routes for where our detection service will live. Let’s create a route to our uploaded_file() function that will allow us to pass an image to the endpoint for identification.
from PIL import Image
from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    TEST_IMAGE_PATHS = [ os.path.join(PATH_TO_TEST_IMAGES_DIR, filename.format(i)) for i in range(1, 2) ]
    IMAGE_SIZE = (12, 8)

    with detection_graph.as_default():
        with tf.Session(graph=detection_graph) as sess:
            for image_path in TEST_IMAGE_PATHS:
                image =
                image_np = load_image_into_numpy_array(image)
                image_np_expanded = np.expand_dims(image_np, axis=0)
                image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
                boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
                scores = detection_graph.get_tensor_by_name('detection_scores:0')
                classes = detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = detection_graph.get_tensor_by_name('num_detections:0')
                (boxes, scores, classes, num_detections) =
                    [boxes, scores, classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})
                im = Image.fromarray(image_np)

    return send_from_directory(app.config['UPLOAD_FOLDER'],
                               filename)
Using the decorator syntax to define the route configures the service so that our uploaded_file() function will be called every time someone hits that route. We said we wanted users to be able to specify a file to be identified, so we’ll store that as a parameter from the request:
file_name = request.args['file']
In an actual app, we’d probably populate this from a form attachment or URL. For our example, we’ll simply let users specify a path to the file that they want identified.

We can then read the image file and turn it into a tensor to be used as input to the graph we loaded previously. The base script included a number of useful functions including read_tensor_from_image_file() which will take the image file and turn it into a tensor to use as input by using a small custom TensorFlow graph.

Running the inference on our graph with this image is again quite straightforward:
       (boxes, scores, classes, num_detections) =
                    [boxes, scores, classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})
Here, image_np_expanded is the image tensor created from the uploaded image file. TensorFlow will then take that image and run the newly retrained model to generate predictions.

Those predictions come as a series of probabilities that indicate which of the classes (poodle, pug, or wiener dog) is the most likely. Since this is just a prediction service, it will simply return a JSON representation of the arrays.

Inside our script we can start our service with:
, port=5000)
Then, if we want to launch the script from the command line, all we have to do is run python and it will initialize and start running on port 5000.

Using the Service

We can now use this service either by visiting it in a web browser or by making a REST call on that port. For an easy test, we can access it using the following code snippet:

import os
import requests

url = ''
with open(path_img, 'rb') as img:
  name_img = os.path.basename(path_img)
  files = {'file': (name_img, img, 'multipart/form-data', {'Expires': '0'})}
  with requests.Session() as s:
    r =, files=files)

Download the App

Download the Android app by scanning the QR code (Android App) or by clicking this Link.