How to build an image dataset

Stavros Niafas
3 min read · Apr 10, 2018


When I was given the subject of my master's thesis, I had no idea how painful it could be to build a CBIR (content-based image retrieval) application entirely from scratch, using hand-crafted image features. The goal was to tackle the problem of building recognition, i.e. given a query image of a specific building, retrieve images of the same building from a database.

Collage from random buildings

But let me introduce the purpose of this post. I am writing it to point out the difficulties I ran into and the ways I managed to tackle them, in case someone is keen on publishing a post-modern radical dataset and shaking up the Kaggle community. Just kidding.

Why I started from scratch

Obviously, I could easily have used one of the already famous building datasets like Oxford (actually I used it to contaminate mine), but they lacked a specific structure and a per-building classification, and their file naming was random, elements that could confuse the evaluation methodology and the experimental results.

So I geared up with extra patience and started shooting photos.

The Vyronas Dataset

The proposed dataset is not that large, but it is well structured and consistently named. It comprises 900 photos of 60 buildings in the vicinity of Vyronas, Athens, Greece. The dataset consists of urban buildings with a variety of architectural styles, numbers of floors, construction ages, colors, etc. Each building has a series of 15 photos, taken from 5 viewpoints under 3 illumination conditions. It took me almost 3 months, including exhaustive filtering and screening of the final buildings. In reality I captured over 100 buildings and ended up with 1200+ photos, but hitting all 3 illumination conditions for every one of them proved impossible, so many of them did not make the final cut.
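The arithmetic and the filtering step above can be sketched in a few lines of Python. This is only an illustration, not the code I used at the time: the viewpoint and illumination labels are hypothetical (the post only names "left" and "day"), and so are the function names.

```python
from itertools import product

# Hypothetical labels: only "left" and "day" appear in the post itself.
VIEWPOINTS = ["left", "halfleft", "front", "halfright", "right"]
ILLUMINATIONS = ["day", "dusk", "night"]


def required_shots():
    """All (viewpoint, illumination) pairs a single building needs."""
    return set(product(VIEWPOINTS, ILLUMINATIONS))


def complete_buildings(captures):
    """Return the buildings whose captures cover every required pair.

    `captures` maps a building id to the set of (viewpoint, illumination)
    pairs photographed so far.
    """
    needed = required_shots()
    return sorted(b for b, shots in captures.items() if needed <= shots)


# Toy data: building02 is missing one night shot, so it is filtered out.
captures = {
    "building01": required_shots(),
    "building02": required_shots() - {("front", "night")},
}

print(len(required_shots()))         # photos per building: 5 * 3
print(60 * len(required_shots()))    # photos in the final dataset
print(complete_buildings(captures))  # buildings that pass the filter
```

A single missing pair is enough to drop a building, which is roughly why a pool of 100+ buildings can shrink so much.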

Organising and Logging

First things first. Capturing photos of residential buildings makes you feel like a spy. To keep track of the captured buildings, a spreadsheet is a must. Walking through the streets, I always had to write down the address of each building as well as the illumination condition it was captured under. A single capture per viewpoint is not advised, as you want several options to pick the best one from when it comes to filtering.
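For anyone who prefers a plain CSV file over a spreadsheet, the log could look roughly like the sketch below. The column names and the helper functions are my assumptions for illustration, and an in-memory buffer stands in for a real file.

```python
import csv
import io
from collections import defaultdict

# Assumed columns; the real spreadsheet layout is not described in the post.
FIELDS = ["address", "illumination", "viewpoint"]


def log_capture(handle, address, illumination, viewpoint):
    """Append one capture as a row to an open CSV log."""
    csv.DictWriter(handle, fieldnames=FIELDS).writerow(
        {"address": address, "illumination": illumination, "viewpoint": viewpoint}
    )


def captures_by_address(handle):
    """Group the logged (illumination, viewpoint) pairs by address."""
    grouped = defaultdict(set)
    for row in csv.DictReader(handle, fieldnames=FIELDS):
        grouped[row["address"]].add((row["illumination"], row["viewpoint"]))
    return dict(grouped)


log = io.StringIO()  # stands in for a file opened on disk
log_capture(log, "Example Street 1", "day", "left")
log_capture(log, "Example Street 1", "day", "front")
log_capture(log, "Example Street 2", "night", "left")

log.seek(0)
print(captures_by_address(log))
```

Keying everything on the address keeps the log usable in the street, before any building has been given a number.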

After each photo session, I obviously had to go through all the photos one by one, rename them to something eye-friendly like building01_left_day (not a necessity, though), and update my spreadsheet, e.g. add new buildings, update the existing ones, mark the remaining captures, etc.
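A naming scheme like building01_left_day is also easy to generate and to read back with a script. The sketch below assumes a two-digit building number, lowercase labels without underscores, and a .jpg extension; none of these details are fixed by the post, and the function names are hypothetical.

```python
import re
from pathlib import Path

# Assumed pattern: building<two digits>_<viewpoint>_<illumination>
NAME_RE = re.compile(r"^building(?P<num>\d{2})_(?P<view>[a-z]+)_(?P<light>[a-z]+)$")


def make_name(number, viewpoint, illumination, suffix=".jpg"):
    """Build a file name such as building01_left_day.jpg."""
    return f"building{number:02d}_{viewpoint}_{illumination}{suffix}"


def parse_name(filename):
    """Split a file name back into (number, viewpoint, illumination)."""
    match = NAME_RE.match(Path(filename).stem)
    if match is None:
        raise ValueError(f"unexpected file name: {filename}")
    return int(match["num"]), match["view"], match["light"]


name = make_name(1, "left", "day")
print(name)
print(parse_name(name))
```

Because the name carries the building, the viewpoint and the illumination, the files themselves can be cross-checked against the log later on.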

I arrived at this routine after successive mistakes, logging mix-ups and sweaty t-shirts (Athens is a sunny city).

Dataset is only the beginning

Indeed, but a good start is half the work. Data quality largely determines the performance of the project, whatever the project is meant to be. The nature of the data drives the possible business value.

That said, I didn’t, and still don’t, intend to scale up the dataset. It was just an academic project that aimed to introduce me to data analysis through computer vision practices, with a glimpse into web technologies.

Conclusion

If you ask me whether I would do it again or enhance the dataset, that’s a brave NO. Motivation was (and is) the cogwheel behind learning and achieving new things. May this first post prove promising.

Maybe in future posts I shall continue with the development methodology I followed, and how I managed to structure the dataset and make it available for building the application. Until then, you may have a sneak peek at the landing page of the RetBul application and the Vyronas dataset.
