
Say “Hi” to the other Ericsson alumni at Shopify.


I guess everything has an end, even things that have barely started. On the other hand, this is not an end but a continuation.

I have enjoyed your “digital” traces, and the one occasion we met face to face.

Will keep an eye on your feeds, and will probably find a reason to comment on one of your future blog posts.


I read your amazing article on triplet loss. I am running a similar experiment for which I believe triplet loss should work, but I don’t understand why it is not working out for me. Instead of going straight to lossless triplet loss, I thought I would first make regular triplet loss work to some extent.

The task is to separate similar from dissimilar items in a vector space and then cluster them using Annoy with Euclidean distance, which should end up producing clusters of similar items.

The entire process:

I have 300-dimensional word vectors, trained on a universal encyclopedia corpus, and there could be a huge number of classes to separate, e.g. animals, vegetables, fruits, etc.

I created a training set and a test set; each training sample contains an anchor, a positive, and a negative vector.

My implementation of triplet loss:

def triplet_loss(y_true, y_pred):
    # the three embeddings are concatenated along axis 1
    anchor = y_pred[:, 0:300]
    positive = y_pred[:, 300:600]
    negative = y_pred[:, 600:900]

    # squared distance between the anchor and the positive
    pos_dist = K.sum(K.square(anchor - positive), axis=1)

    # squared distance between the anchor and the negative
    neg_dist = K.sum(K.square(anchor - negative), axis=1)

    # compute loss with margin alpha
    alpha = 0.5
    basic_loss = pos_dist - neg_dist + alpha
    loss = K.mean(K.maximum(basic_loss, 0.0))
    return loss
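For what it’s worth, the slicing and distance arithmetic in the loss above can be sanity-checked in plain NumPy on a toy example (dimension shrunk from 300 to 2 for readability; `triplet_loss_np` is just an illustrative re-implementation, not the Keras code itself):

```python
import numpy as np

def triplet_loss_np(y_pred, dim=2, alpha=0.5):
    # each row of y_pred is [anchor | positive | negative], each of length dim
    anchor = y_pred[:, 0:dim]
    positive = y_pred[:, dim:2 * dim]
    negative = y_pred[:, 2 * dim:3 * dim]
    pos_dist = np.sum((anchor - positive) ** 2, axis=1)
    neg_dist = np.sum((anchor - negative) ** 2, axis=1)
    return np.mean(np.maximum(pos_dist - neg_dist + alpha, 0.0))

# easy triplet: negative is far from the anchor, so the hinge clips the loss to 0
easy = np.array([[0.0, 0.0, 0.0, 1.0, 3.0, 0.0]])
# hard triplet: positive is far, negative is close, so the loss is positive
hard = np.array([[0.0, 0.0, 3.0, 0.0, 0.0, 1.0]])
```

For the easy triplet, pos_dist = 1 and neg_dist = 9, so basic_loss = 1 − 9 + 0.5 < 0 and the loss is 0; for the hard triplet the roles swap and the loss is 9 − 1 + 0.5 = 8.5.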

My network:

def create_base_network(input_dim=300):
    t = 'tanh'
    model = Sequential()
    model.add(Dense(input_dim, input_shape=(input_dim,), activation=t))
    model.add(Dense(600, activation=t))
    model.add(Dense(input_dim, activation=t))
    return model

def init_model(self):
    # one shared base network embeds all three inputs
    base_network = create_base_network()

    anchor_in = Input(shape=(300,))
    positive_in = Input(shape=(300,))
    negative_in = Input(shape=(300,))

    anchor_out = base_network(anchor_in)
    positive_out = base_network(positive_in)
    negative_out = base_network(negative_in)

    merged_vector = concatenate([anchor_out, positive_out, negative_out], axis=1)
    model = Model(inputs=[anchor_in, positive_in, negative_in], outputs=merged_vector)
    return model

Training:

model.fit(train_data, np.zeros(len(train_data[0])),  # one dummy label per triplet
          epochs=100, shuffle=True, steps_per_epoch=None, batch_size=128)

The total number of training samples is ~7k.

Note: the labels I am passing to model.fit are all zeros, because they don’t matter for the triplet loss (it only uses y_pred).

When training the model, the loss converges to 0 after about 10 epochs on average, but the actual separation is not affected at all.

The test is performed as follows:

To test the separation between similar vectors and dissimilar vectors, you just need two vectors at a time. We cannot use the three-input training model directly, so what I did was extract the weights of the base network and, in a different script, build the exact same network so that I could load the learned weights into it. My expectation is that the learned weights should produce the right separation between the two input vectors.
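The weight-transfer step described above can be sketched like this (a minimal illustration, not the actual code: `make_base_network` is a tiny stand-in for the 300-600-300 base network, shrunk to 4 units so it runs instantly):

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def make_base_network(input_dim=4):
    # toy stand-in for the base network; same idea, smaller layer
    model = Sequential()
    model.add(Dense(input_dim, input_shape=(input_dim,), activation='tanh'))
    return model

trained = make_base_network()      # pretend this one was trained
weights = trained.get_weights()    # extract the (learned) weights

fresh = make_base_network()        # identical architecture, e.g. in another script
fresh.set_weights(weights)         # load the extracted weights

x = np.random.rand(2, 4).astype('float32')
same = np.allclose(trained.predict(x, verbose=0), fresh.predict(x, verbose=0))
```

If the architectures match exactly, `get_weights`/`set_weights` makes the two models produce identical outputs, so any remaining lack of separation is not a transfer bug.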

But it turns out that it does not make any separation.

To measure the separation I am using Cohen’s d as the metric:

d = (mean difference) / (standard deviation)
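A minimal pooled-standard-deviation version of Cohen’s d can be computed as below (a sketch; here `a` and `b` would be, for example, the distributions of dissimilar-pair and similar-pair distances):

```python
import numpy as np

def cohens_d(a, b):
    # mean difference divided by the pooled sample standard deviation
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1)
                  + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

a = np.array([2.0, 4.0, 6.0])   # e.g. dissimilar-pair distances
b = np.array([1.0, 3.0, 5.0])   # e.g. similar-pair distances
```

With these toy inputs both samples have variance 4, so the pooled standard deviation is 2 and d = (4 − 3) / 2 = 0.5.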


The separation between the samples in the input vector space is 1; ideally, after the transform this value should become 4, considering only the −2σ to +2σ region of the solution space.

But when testing the model, the resulting separation value is still 1.

Even if I don’t train the model at all, the value is still 1, which is unexpected, because the weights would then be completely random.

I can no longer reason about what is happening with the triplet loss, or why it is not able to do the job.

I have been dealing with this situation for a long time. Please share your thoughts on this; any help is very much appreciated.

Thank You,

Shivam Srivastava

