How to build an image dataset
When I was given the subject of my master's thesis, I had no idea how painful it could be to build a CBIR (content-based image retrieval) application entirely from scratch using hand-crafted image features. The goal was to tackle the problem of building recognition, i.e. given a query image of a specific building, retrieve images of the same building from a database.
But let me get to the purpose of this post. I am writing it to point out the difficulties I ran into and the ways I managed to tackle them, in case someone is keen to publish a post-modern radical dataset and shake the Kaggle community. Just kidding.
Why I decided to start from scratch
Obviously, I could easily have used one of the already famous building datasets like Oxford (actually I used it to contaminate mine), but they lacked a specific structure and had random naming and building classification, elements that could confuse the evaluation methodology and the experimental results.
So I geared up with extra patience and started shooting photos.
The Vyronas Dataset
The proposed dataset is not that large, but it is well structured and consistently named. It comprises 900 photos of 60 buildings in the vicinity of Vyronas, Athens, Greece. The dataset consists of urban buildings with a variety of architectural styles, numbers of floors, construction ages, colors, etc. Each building has a series of 15 photos, taken from 5 viewpoints under 3 illumination conditions. It took me almost 3 months, including exhaustive filtering and screening to select the final buildings. In reality I captured over 100 buildings and ended up with 1200+ photos, but hitting all 3 illumination conditions for every one of them proved impossible.
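To make the structure concrete, here is a minimal sketch that enumerates the photo names such a dataset would contain. The naming pattern follows the building01_left_day example used in this post; the viewpoint and illumination labels other than "left" and "day" are placeholders I made up for illustration, not the actual labels of the dataset.

```python
from itertools import product

# Hypothetical labels: only "left" and "day" appear in this post,
# the remaining ones are placeholders for illustration.
VIEWPOINTS = ["left", "frontleft", "front", "frontright", "right"]
ILLUMINATIONS = ["day", "cloudy", "night"]
N_BUILDINGS = 60


def expected_names(n_buildings=N_BUILDINGS):
    """List every photo name: one per building, viewpoint and illumination."""
    return [
        f"building{b:02d}_{view}_{light}"
        for b in range(1, n_buildings + 1)
        for view, light in product(VIEWPOINTS, ILLUMINATIONS)
    ]


names = expected_names()
print(len(names))  # 60 buildings * 5 viewpoints * 3 illuminations
```

A list like this is handy as a checklist: anything in it that does not exist on disk is a capture still to be taken.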
Organising and Logging
First things first. Capturing photos of residential buildings makes you feel like a spy. In order to keep track of the captured buildings, a spreadsheet is a must. Walking through the streets, I always had to write down the address of each building as well as the illumination condition under which it was captured. Taking a single capture per viewpoint is not advised; you should have several to choose from when it comes to filtering.
After each photo session, I obviously had to go through all the photos one by one, rename them to something eye-friendly like building01_left_day (not a necessity, though), and update my spreadsheet: add new buildings, update the existing ones, mark the remaining captures, etc.
I arrived at this routine after successive mistakes, logging confusions and sweaty T-shirts (Athens is a sunny city).
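The bookkeeping described in this section could also be partly automated. The sketch below is not what I used (I relied on a spreadsheet); it only assumes that files follow the building01_left_day pattern, and the label sets are again hypothetical apart from "left" and "day". It reports which captures each building is still missing.

```python
import re
from collections import defaultdict
from pathlib import Path

# Hypothetical labels: only "left" and "day" appear in this post.
VIEWPOINTS = {"left", "frontleft", "front", "frontright", "right"}
ILLUMINATIONS = {"day", "cloudy", "night"}
NAME_RE = re.compile(r"^(building\d+)_([a-z]+)_([a-z]+)$")


def missing_captures(filenames):
    """Map each building seen so far to the (viewpoint, illumination) pairs it lacks."""
    seen = defaultdict(set)
    for filename in filenames:
        # Drop any directory and extension, e.g. "x/building01_left_day.jpg"
        match = NAME_RE.match(Path(filename).stem)
        if match is None:
            continue  # name does not follow the convention, skip it
        building, view, light = match.groups()
        if view in VIEWPOINTS and light in ILLUMINATIONS:
            seen[building].add((view, light))
    required = {(v, i) for v in VIEWPOINTS for i in ILLUMINATIONS}
    return {building: required - got for building, got in seen.items()}


def complete_buildings(filenames):
    """Buildings that already have all 15 captures."""
    return sorted(b for b, miss in missing_captures(filenames).items() if not miss)
```

Calling `missing_captures` on the names of the files in the photo folder after each session gives the list of captures still to be taken, which is the same information the spreadsheet tracked by hand.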
The dataset is only the beginning
Indeed, but a good start is half the work. Data quality ensures the performance of the project, whatever the project is meant to be. The nature of the data drives the possible business value.
That said, I did not, and still do not, intend to scale up the dataset. It was just an academic project that aimed to introduce me to data analysis through computer vision practices, with a glimpse of web technologies.
Conclusion
If you ask me whether I would do it again or enhance the dataset, that’s a brave NO. Motivation was (and is) the cogwheel behind learning and achieving new things. May this first post prove promising.