EE599 Homework 4: Dataset Description and Category Classification

1           Dataset Description
Polyvore Outfits [1] is a real-world dataset built from users' preferences for outfit configurations on the website polyvore.com: items within outfits that receive high ratings are considered compatible, and vice versa. It contains a total of 365,054 items and 68,306 outfits. The maximum number of items per outfit is 19. A visualization of an outfit is shown in Figure 1.



Figure 1: A visualization of a partial outfit in the dataset. The number at the bottom of each image is the ID of this item.

2           Category Classification
•    The starter code provides the following files, which contain blanks to fill in and are best read in this order. First, you need to set your dataset location in utils.py (Config['root_path']).

1.   train_category.py, train_compat.py: training scripts

2.   model.py: CNN classification models

3.   data.py: dataset preparation

4.   utils.py: utility functions and config

•    Training takes place in the train_model function of train_*.py. In each iteration, the model takes in batches of data provided by the dataloader (data.py). Record your training accuracy here; it will be used for plotting the learning curve.
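
A minimal sketch of what such a training loop might look like, assuming a standard PyTorch setup; the function name, argument names, and the way accuracy is accumulated are illustrative, not the starter code's exact API:

```python
import torch
import torch.nn as nn

def train_one_epoch(model, dataloader, optimizer, device):
    """Run one training epoch and return the average loss and accuracy."""
    criterion = nn.CrossEntropyLoss()
    model.train()
    total_loss, correct, seen = 0.0, 0, 0
    for images, labels in dataloader:          # batches come from data.py
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        logits = model(images)
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()

        total_loss += loss.item() * images.size(0)
        correct += (logits.argmax(dim=1) == labels).sum().item()
        seen += images.size(0)
    # record these two numbers every epoch to plot the learning curve later
    return total_loss / seen, correct / seen
```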

•    You’re expected to do both fine-tuning and training from scratch.

1.   Fine-tune a model pretrained on ImageNet (e.g., ResNet50). Modern frameworks provide easy access to such models; refer to the documentation online. (A sketch follows this list.)

2.   Construct a model of your own and start training from scratch.

3.   Compare these two models and record the results. What is the advantage of using a fine-tuned model? What is the difference between the learning rates when you apply these two learning strategies (i.e., fine-tuning vs. training from scratch)?
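
As an illustration of item 1, a hedged sketch of loading a torchvision ResNet-50 pretrained on ImageNet and replacing its classification head; the number of categories and the learning-rate choices below are assumptions for illustration, not values given in the assignment:

```python
import torch.nn as nn
import torch.optim as optim
import torchvision.models as models

num_categories = 153  # hypothetical; use the number of categories in your metadata

# Fine-tuning: start from ImageNet weights and replace the classification head.
# (Newer torchvision versions use the weights= argument instead of pretrained=True.)
finetune_model = models.resnet50(pretrained=True)
finetune_model.fc = nn.Linear(finetune_model.fc.in_features, num_categories)

# From scratch: same architecture, randomly initialized weights.
scratch_model = models.resnet50(pretrained=False, num_classes=num_categories)

# A typical (not mandated) choice: a smaller learning rate when fine-tuning,
# since the pretrained weights only need small adjustments.
finetune_opt = optim.Adam(finetune_model.parameters(), lr=1e-4)
scratch_opt = optim.Adam(scratch_model.parameters(), lr=1e-3)
```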

Note: images are located in the images folder, and each image is named by its ID. Information for each item is stored in polyvore_item_metadata.json.

•    Modify data.py to create (image, category label) data pairs. Normalization is defined in the get_data_transforms function.
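
A minimal sketch of the kind of Dataset that data.py might implement, assuming the metadata JSON maps each item ID to a "category_id" field and that images are stored as <item_id>.jpg; these layout details are assumptions based on the description above, so adapt them to the actual files:

```python
import json
import os

from PIL import Image
from torch.utils.data import Dataset

class CategoryDataset(Dataset):
    """Yields (image_tensor, category_label) pairs for the classifier."""

    def __init__(self, root_path, item_ids, label_map, transform):
        self.image_dir = os.path.join(root_path, "images")
        with open(os.path.join(root_path, "polyvore_item_metadata.json")) as f:
            self.meta = json.load(f)
        self.item_ids = item_ids      # item IDs belonging to this split
        self.label_map = label_map    # category_id string -> integer label
        self.transform = transform    # from get_data_transforms()

    def __len__(self):
        return len(self.item_ids)

    def __getitem__(self, idx):
        item_id = self.item_ids[idx]
        path = os.path.join(self.image_dir, f"{item_id}.jpg")
        image = Image.open(path).convert("RGB")
        label = self.label_map[self.meta[item_id]["category_id"]]
        return self.transform(image), label
```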

•    Split off no less than 10% of the data for validating your final model. The test set is test_category_hw.txt.
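
One simple way to hold out the validation set, sketched here with torch.utils.data.random_split; the 10% fraction just meets the requirement above, and the fixed seed is only for reproducibility:

```python
import torch
from torch.utils.data import Dataset, random_split

def split_train_val(dataset: Dataset, val_fraction: float = 0.1):
    """Hold out a fraction of the dataset (at least 10%) for validation."""
    val_size = max(1, int(val_fraction * len(dataset)))
    generator = torch.Generator().manual_seed(0)  # reproducible split
    return random_split(dataset, [len(dataset) - val_size, val_size],
                        generator=generator)
```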

•    Tips:

1.   Over-fitting is expected. Experiment with the model structure, hyper-parameters, or regularization to reduce over-fitting. You can design any model structure you like.

2.   To speed up training, you can set the "use_cuda" flag to true and increase the batch size defined in utils.py (see the Config sketch after these tips).

3.   You can restrict the size of the dataset for quick debugging by setting debug=True in utils.py. You do not necessarily need to use 20 epochs or the entire training set.

4.   It may take several epochs or more before the performance plateaus, depending on the network structure you use and the learning rate.
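
The tips above refer to several switches in utils.py. A hypothetical Config along these lines shows how they might fit together; the actual key names and values in the starter code may differ:

```python
import torch

# Hypothetical contents of utils.py; key names mirror the ones mentioned above,
# but check the starter code for the real Config.
Config = {
    "root_path": "/path/to/polyvore_outfits",  # set this to your dataset location
    "use_cuda": torch.cuda.is_available(),     # tip 2: enable GPU training
    "batch_size": 64,                          # tip 2: raise this if memory allows
    "debug": False,                            # tip 3: True restricts the dataset size
    "num_epochs": 20,                          # tip 3: fewer epochs are acceptable
    "learning_rate": 1e-4,
}
```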

3           Pairwise Compatibility Prediction
The task is to predict the compatibility of an outfit (Figure 2). It is essentially a binary classification problem (compatible or incompatible); however, the difficulty lies in the input: you classify based on a set of items rather than a single item, as you did in the last section. One idea for dealing with set classification is to decompose it into pairwise predictions (you're encouraged to propose different ideas for the bonus section). Therefore, you'll first train a pairwise compatibility classifier.
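
To make the decomposition concrete, here is a small sketch that turns an outfit (a set of item IDs) into all candidate pairs and averages the pairwise scores into an outfit-level prediction; the pair_score function is a placeholder for the classifier trained in this section, and averaging is only one possible aggregation rule:

```python
from itertools import combinations
from statistics import mean

def outfit_compatibility(item_ids, pair_score):
    """Score an outfit by averaging a pairwise classifier over all item pairs.

    pair_score(a, b) is assumed to return the probability that items a and b
    are compatible (the classifier trained in this section).
    """
    pair_probs = [pair_score(a, b) for a, b in combinations(item_ids, 2)]
    return mean(pair_probs)

# Example: a 3-item outfit yields 3 pairs; with a dummy scorer the outfit score
# is just the average of the three pairwise probabilities.
print(outfit_compatibility(["id1", "id2", "id3"], lambda a, b: 0.8))
```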



Figure 2: Examples of a compatible item and an incompatible item.

•    Modify data.py to create a new dataloader that yields pairs of image inputs (compatible pairs and incompatible pairs). For example, assume any pair of items in a compatible outfit is considered compatible, whereas an incompatible outfit provides negative pairs.
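
A hedged sketch of such a pair dataset; the sampling rule (positive pairs from compatible outfits, negative pairs from incompatible ones) follows the example above, while the pair-list format and .jpg file layout are assumptions:

```python
import os

from PIL import Image
from torch.utils.data import Dataset

class PairDataset(Dataset):
    """Yields ((image_a, image_b), label) with label 1 = compatible, 0 = incompatible."""

    def __init__(self, root_path, pairs, transform):
        # pairs: list of (item_id_a, item_id_b, label) built from the outfits --
        # all pairs inside a compatible outfit get label 1, pairs drawn from an
        # incompatible outfit get label 0.
        self.image_dir = os.path.join(root_path, "images")
        self.pairs = pairs
        self.transform = transform

    def _load(self, item_id):
        path = os.path.join(self.image_dir, f"{item_id}.jpg")
        return self.transform(Image.open(path).convert("RGB"))

    def __len__(self):
        return len(self.pairs)

    def __getitem__(self, idx):
        id_a, id_b, label = self.pairs[idx]
        return (self._load(id_a), self._load(id_b)), float(label)
```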

•    Modify model.py to create a new model that takes in a pair of inputs and outputs a compatibility probability for this pair.
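
One common way to build such a model is a Siamese-style network: a shared CNN encodes each image, and a small head maps the concatenated embeddings to a probability. A sketch assuming a ResNet-18 backbone (the assignment does not prescribe an architecture, so treat this as one option):

```python
import torch
import torch.nn as nn
import torchvision.models as models

class PairCompatibilityModel(nn.Module):
    """Shared CNN encoder + MLP head that outputs P(compatible) for an image pair."""

    def __init__(self, embed_dim=512):
        super().__init__()
        backbone = models.resnet18(pretrained=True)
        backbone.fc = nn.Identity()      # keep the 512-d pooled features
        self.encoder = backbone
        self.head = nn.Sequential(
            nn.Linear(2 * embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, image_a, image_b):
        # Encode both images with the same weights, then classify the pair.
        feats = torch.cat([self.encoder(image_a), self.encoder(image_b)], dim=1)
        return torch.sigmoid(self.head(feats)).squeeze(1)  # compatibility probability
```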

•    Split off no less than 10% of your data for validation. The test set is test_pairwise_compat_hw.txt.
