Image Labeling — Take a Part in Machine Learning

Andhika S Pratama
Data Folks Indonesia
6 min read · Sep 21, 2020


Example of an annotated image of confused Nick Young

Image labeling is one of the tasks I am given most often as a Data Annotator. Unfortunately, the task is still unknown to many people and is familiar mostly to those who already know something about machine learning and computer vision.

I’ll try to explain what image labeling is, what it is for, and how to do it, based on my experience and on what I’ve read so far. Hopefully, this will bring image labeling to a wider audience.

So… let’s get started!

What is Image Labeling?

Image labeling is the process of drawing a tight shape around an entity in an image and assigning it a label or labels. There is no limit to what entities you can label, as long as they are clearly visible in the image.

Image labeling is important in supervised machine learning because the annotated data is used to train the model: the model learns from the labels, and the quality of its results depends on the quality of the data it was given.

There is a common phrase in machine learning, “garbage in, garbage out”, which means that the quality of the results is determined by the quality of the input data.

You need to be consistent in data annotation, because if you label an apple as an orange, the model will learn that mapping, and the next time it sees an apple it may recognize it as an orange.

Consistency is key in data annotation
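
For readers who do code, one simple way to catch inconsistent labels is to compare every label used against an agreed label list. The sketch below is only an illustration: the function name `find_inconsistent_labels`, the label list, and the sample annotations are all made up for this example.

```python
from collections import Counter

def find_inconsistent_labels(labels, allowed):
    """Return the labels that are not in the agreed list, with how often each appears."""
    allowed_set = set(allowed)
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if label not in allowed_set}

# Hypothetical label list and annotations, for illustration only.
ALLOWED = ["apple", "orange"]
annotations = ["apple", "Apple", "orange", "aple", "apple"]

unexpected = find_inconsistent_labels(annotations, ALLOWED)
print(unexpected)  # labels that need to be fixed before training
```

A check like this flags differences in spelling and capitalization, but it cannot tell whether an apple was labeled as an orange; that still needs a human review.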

How to Do Image Labeling?

Before we go straight to image labeling, there are some preparations we need to make:

1. Data

In image annotation, the data must of course be provided as images, and in the best quality possible (not blurry or low-resolution). It is also a good idea to provide the images as .jpg files, because some tools do not support other image formats such as .png or .raw.
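
Before opening a labeling tool, it can help to check which files are in a format the tool accepts. Here is a minimal Python sketch, assuming a tool that only reads .jpg/.jpeg files; the `SUPPORTED` set, the function name, and the file names are assumptions for illustration, so adjust them to your own tool and data.

```python
from pathlib import Path

# Assumed for illustration: the tool in use accepts only these extensions.
SUPPORTED = {".jpg", ".jpeg"}

def split_by_support(filenames, supported=SUPPORTED):
    """Split file names into (supported, unsupported) by extension, ignoring case."""
    ok, not_ok = [], []
    for name in filenames:
        if Path(name).suffix.lower() in supported:
            ok.append(name)
        else:
            not_ok.append(name)
    return ok, not_ok

# Hypothetical file names.
files = ["shelf_01.JPG", "shelf_02.png", "shelf_03.raw", "shelf_04.jpeg"]
ok, not_ok = split_by_support(files)
print("ready to label:", ok)
print("convert first:", not_ok)
```

This only looks at the file extension. It does not check image quality, so blurry or low-resolution images still have to be filtered by eye.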

2. Tools

top 4 annotation tools in the dataset.com

As far as I know, some companies have their own in-house labeling tools for data annotation. There are also AI companies that provide data annotation services for large-scale data, such as Scale AI and Nodeflux.

To be clear, I don’t know how these companies’ annotation tools work, because I only use free tools such as labelimg, labelme, and the many other tools listed in the dataset.com.

Different tools have different advantages and disadvantages, ranging from the types of shapes available (bounding box, polygon, points, etc.) to the import and export formats supported. Make sure you understand your annotation needs before choosing a tool!

Comparison table of different annotation tools (Source)

Now that we have prepared everything, let’s proceed to image labeling!

Basically, as stated at the beginning, all you have to do in image labeling is “draw a tight shape” around the entity that you want to label.

The tools offer many shape variations, so that the quality of the annotated images can go plus ultra!

So… here are the two shapes that I used the most:

Rectangle Shape (Bounding Box)

An example of bounding boxes, taken with the labelimg tool.

In this case, I used bounding boxes for object detection in a warung (a small Indonesian neighborhood shop) and also for OCR (Optical Character Recognition).

Bounding boxes are mostly used for object detection and localization tasks. A bounding box is stored as four coordinates: xmin, ymin, xmax, and ymax.
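
To make the four coordinates concrete, here is a small Python sketch of a bounding box as a data structure. The class name `BoundingBox` and its helper properties are assumptions for illustration, not part of any labeling tool; the numbers are taken from the first object in the Pascal VOC example later in this article.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BoundingBox:
    """A rectangle in pixel coordinates: (xmin, ymin) is the top-left corner,
    (xmax, ymax) is the bottom-right corner."""
    xmin: int
    ymin: int
    xmax: int
    ymax: int

    def __post_init__(self):
        # A valid box must have a positive width and height.
        if self.xmin >= self.xmax or self.ymin >= self.ymax:
            raise ValueError("xmin/ymin must be smaller than xmax/ymax")

    @property
    def width(self):
        return self.xmax - self.xmin

    @property
    def height(self):
        return self.ymax - self.ymin

    @property
    def area(self):
        return self.width * self.height

box = BoundingBox(xmin=1, ymin=1758, xmax=168, ymax=2235)
print(box.width, box.height, box.area)
```

The validation step is a cheap guard against a common annotation slip, where the two corners are swapped or the box has zero size.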

Polygon Shape

An example of a polygon shape, taken with the labelme tool.

Sometimes the items in a warung are just not rectangular. To annotate these kinds of items, a polygon shape is used to capture the shape and location of the items more precisely.
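
One way to see why a polygon is more precise is to compare its area with the area of the rectangle that would be needed to enclose it. The Python sketch below uses the shoelace formula for the polygon area; the function names and the five sample points are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def polygon_area(points):
    """Area of a simple (non self-intersecting) polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to the first point
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def enclosing_box(points):
    """Smallest axis-aligned box (xmin, ymin, xmax, ymax) that contains every point."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)

# Hypothetical five-point polygon.
points = [(0.0, 0.0), (4.0, 0.0), (5.0, 2.0), (2.0, 4.0), (0.0, 3.0)]

xmin, ymin, xmax, ymax = enclosing_box(points)
box_area = (xmax - xmin) * (ymax - ymin)
poly_area = polygon_area(points)
print(poly_area, box_area, poly_area / box_area)
```

In this made-up case the polygon fills only three quarters of its enclosing box, so a bounding box would include a quarter of background that does not belong to the item.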

There are more shapes in image labeling, but I haven’t had the chance to use them in my recent projects. Still, I will mention them anyway and give an explanation based on what I’ve read.

> 3D Cuboids

An example of 3D Cuboids. Source: anolytics.ai

3D cuboids are like the 3D version of the bounding box: they add depth information to the labeled object.

From what I’ve read, this type of annotation shape is used for self-driving cars.

> Semantic Segmentation

An example of semantic segmentation. Source: scale.com

In this image, different entities are marked with different classes. Semantic segmentation is said to be used mostly for self-driving cars, so that they can understand the environment they are operating in.

> Lines and Splines

An example of Lines and Splines. Source: cogitotech.com

Lines and splines have various use cases. They can be used in autonomous vehicles for lane and boundary detection and recognition, and also for drones and robotics.

As the name and the example suggest, this is the labeling of straight lines and curved lines.

> Key-Point and Landmark

An example of Key-Points and Landmark Annotation. Source: Awakening Vector

Key-point and landmark shapes are mostly used for detecting facial features, facial expressions, emotions, and more. They are created by placing dots on the entity being labeled.

3. Data Annotation Save Format

There are several formats for saving annotated data, but I personally use only two for object detection: Pascal VOC and JSON.

JSON

Labelme supports only the JSON format. Here is a trimmed example of a JSON file:

{
  "version": "4.2.10",
  "flags": {},
  "shapes": [
    {
      "label": "Banner Surya Pro Mild",
      "points": [
        [
          0.31578947368416266,
          305.2631578947368
        ],
        [
          405.57894736842104,
          450.0
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "Banner Gudang Garam",
      "points": [
        [
          406.8947368421052,
          286.8421052631579
        ],
        [
          866.1052631578948,
          248.68421052631578
        ],
        [
          884.5263157894738,
          406.57894736842104
        ],
        [
          625.3157894736842,
          432.89473684210526
        ],
        [
          408.2105263157895,
          447.36842105263156
        ]
      ],
      "group_id": null,
      "shape_type": "polygon",
      "flags": {}
    }
  ]
}
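
For readers who want to use such a file in code, here is a minimal Python sketch that reads the shapes with the standard `json` module and reduces each one to its enclosing box. The embedded JSON is a shortened copy of the example above with rounded coordinates, and the function name `read_shapes` is an assumption for illustration, not part of labelme.

```python
import json

# Shortened copy of the example above, with coordinates rounded for readability.
LABELME_JSON = """
{
  "version": "4.2.10",
  "flags": {},
  "shapes": [
    {"label": "Banner Surya Pro Mild",
     "points": [[0.32, 305.26], [405.58, 450.0]],
     "group_id": null, "shape_type": "rectangle", "flags": {}},
    {"label": "Banner Gudang Garam",
     "points": [[406.89, 286.84], [866.11, 248.68], [884.53, 406.58],
                [625.32, 432.89], [408.21, 447.37]],
     "group_id": null, "shape_type": "polygon", "flags": {}}
  ]
}
"""

def read_shapes(text):
    """Return (label, shape_type, (xmin, ymin, xmax, ymax)) for every shape."""
    data = json.loads(text)
    rows = []
    for shape in data["shapes"]:
        xs = [point[0] for point in shape["points"]]
        ys = [point[1] for point in shape["points"]]
        rows.append((shape["label"], shape["shape_type"],
                     (min(xs), min(ys), max(xs), max(ys))))
    return rows

shapes = read_shapes(LABELME_JSON)
for label, shape_type, box in shapes:
    print(label, shape_type, box)
```

Note that a rectangle is stored as just two corner points, while a polygon lists every vertex, so taking the minimum and maximum of the x and y values works for both.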

Pascal VOC

Pascal VOC is supported by some tools, such as labelimg. Pascal VOC annotations are stored as XML files. Here is a trimmed example for object detection:

<annotation>
  <folder>Warpin GG.Raden 1</folder>
  <filename>02_depan_dekat_normal.jpg</filename>
  <path>C:\Warung Pintar\Photos of Product at Warung\Warung Pintar\Warpin GG.Raden 1\02_depan_dekat_normal.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>3088</width>
    <height>3088</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>FAN001, Botol</name>
    <pose>Unspecified</pose>
    <truncated>1</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>1</xmin>
      <ymin>1758</ymin>
      <xmax>168</xmax>
      <ymax>2235</ymax>
    </bndbox>
  </object>
  <object>
    <name>TEH010, Botol</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>141</xmin>
      <ymin>1804</ymin>
      <xmax>284</xmax>
      <ymax>2236</ymax>
    </bndbox>
  </object>
</annotation>
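
A Pascal VOC file like this can be read with Python's standard `xml.etree.ElementTree` module. The sketch below is one possible way to do it, not the way labelimg itself works; the embedded XML keeps only the fields the code needs, and the function name `read_voc_objects` is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example above, keeping only the fields used below.
VOC_XML = """<annotation>
  <filename>02_depan_dekat_normal.jpg</filename>
  <size><width>3088</width><height>3088</height><depth>3</depth></size>
  <object>
    <name>FAN001, Botol</name>
    <bndbox><xmin>1</xmin><ymin>1758</ymin><xmax>168</xmax><ymax>2235</ymax></bndbox>
  </object>
  <object>
    <name>TEH010, Botol</name>
    <bndbox><xmin>141</xmin><ymin>1804</ymin><xmax>284</xmax><ymax>2236</ymax></bndbox>
  </object>
</annotation>"""

def read_voc_objects(xml_text):
    """Return (name, (xmin, ymin, xmax, ymax)) for every <object> in the annotation."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = tuple(int(box.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        objects.append((obj.findtext("name"), coords))
    return objects

objects = read_voc_objects(VOC_XML)
for name, coords in objects:
    print(name, coords)
```

To read a file from disk instead of a string, `ET.parse(path).getroot()` gives the same root element.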

There are many other formats supported by different tools; you just have to know when and why to use each one.

Aaaand that’s it! Those are the steps for doing image labeling. I hope you get to try image labeling at some point!

Throughout my time at Warung Pintar, I have not been able to code, I am not an engineer, and I haven’t had the time to learn to code. But those limitations don’t stop me from taking part in machine learning development as a data annotator, labeling data for supervised machine learning.



Hi there! Currently, I’m a Data Annotator at Tictag.io with an interest in writing, such as copywriting and UX writing.