{"id":4776,"date":"2025-07-11T10:03:52","date_gmt":"2025-07-11T10:03:52","guid":{"rendered":"https:\/\/cloudxlab.com\/blog\/?p=4776"},"modified":"2025-10-09T08:35:53","modified_gmt":"2025-10-09T08:35:53","slug":"quality-of-embeddings-triplet-loss-atharv_katkar","status":"publish","type":"post","link":"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/","title":{"rendered":"Quality of Embeddings &amp; Triplet Loss"},"content":{"rendered":"\n<p class=\"has-small-font-size\" style=\"line-height:0.9\">Author: <a href=\"https:\/\/www.linkedin.com\/in\/atharv-katkar-58ba75213\">Atharv Katkar<\/a> (LinkedIn)<\/p>\n\n\n\n<p class=\"has-small-font-size\" style=\"line-height:0.9\">Directed by: Sandeep Giri<\/p>\n\n\n\n<p><strong>OVERVIEW<\/strong>:<\/p>\n\n\n\n<p>In Natural Language Processing (NLP), embeddings transform human language into numerical vectors. These are typically high-dimensional arrays whose semantic meaning is derived from the text corpus the model was trained on. The quality of these embeddings directly affects the performance of search engines, recommendation systems, chatbots, and more.<\/p>\n\n\n\n<p>But here&#8217;s the problem:<\/p>\n\n\n\n<p>Not all embeddings are created equal.<\/p>\n\n\n\n<p>So how do we measure their quality?<\/p>\n\n\n\n<p><strong>To identify the quality of embeddings, I conducted an experiment:<\/strong><\/p>\n\n\n\n<p>I took three leading free pretrained text \u2192 embedding models, each built on a different approach, fed each one the same set of triplets, and computed the triplet loss to compare how well each captures context.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p><strong>1) Sentence-BERT (SBERT)<\/strong><\/p>\n\n\n\n<p>Transformer-based; captures deep sentence-level semantics:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> from sentence_transformers import SentenceTransformer\n model = SentenceTransformer('all-MiniLM-L6-v2') <\/pre>\n\n\n\n<p><strong>2) Universal Sentence Encoder 
(USE)<\/strong><\/p>\n\n\n\n<p>A TensorFlow Hub encoder; good general-purpose semantic encoding:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> import tensorflow_hub as hub\n model2 = hub.load(\"https:\/\/tfhub.dev\/google\/universal-sentence-encoder\/4\")\n embeddings = model2([\"Cardiac arrest\"]) <\/pre>\n\n\n\n<p><strong>3) FastText (by Facebook AI)<\/strong><\/p>\n\n\n\n<p>Word-based; lightweight and fast, but lacks context:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> import fasttext\n import fasttext.util\n fasttext.util.download_model('en', if_exists='ignore')\n ft3 = fasttext.load_model('cc.en.300.bin')\n vec = ft3.get_word_vector(\"Cardiac arrest\") <\/pre>\n\n\n\n<p>When I compared the outputs, each model produced embeddings of a different shape: <strong>(384,), (1, 512), (300,)<\/strong> for SBERT, USE, and FastText respectively.<\/p>\n\n\n\n<h5><strong>GOALS<\/strong><\/h5>\n\n\n\n<ol><li>Compare the three models using a triplet-based evaluation with triplet loss.<\/li><li>Assess how well each model understands medical terminology.<\/li><\/ol>\n\n\n\n<p><strong>CONCEPTS<\/strong><\/p>\n\n\n\n<p><strong>What is Triplet Loss?<\/strong><\/p>\n\n\n\n<p>Triplet loss works with a 3-part input:<\/p>\n\n\n\n<p>Anchor: The base sentence or phrase<\/p>\n\n\n\n<p>Positive: A semantically similar phrase<\/p>\n\n\n\n<p>Negative: A semantically unrelated (or absurd) phrase<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><tbody><tr><td>Anchor<\/td><td>Positive<\/td><td>Negative<\/td><\/tr><tr><td>tuberculosis<\/td><td>lung infection<\/td><td>test tube accident<\/td><\/tr><tr><td>cardiac arrest<\/td><td>heart attack<\/td><td>cardi b arrest<\/td><\/tr><tr><td>asthma<\/td><td>respiratory condition<\/td><td>spiritual awakening<\/td><\/tr><\/tbody><\/table><figcaption>Samples from my test dataset<\/figcaption><\/figure>\n\n\n\n<p>The goal is to push the anchor close to the positive and far from the negative in embedding 
space.<\/p>\n\n\n\n<p class=\"has-text-align-center has-dark-red-color has-text-color has-medium-font-size\"><strong>TripletLoss = max( d(a, p) \u2212 d(a, n) + margin, 0 )<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> <strong>a<\/strong> = anchor vector\n <strong>p<\/strong> = positive vector (should be close to the anchor)\n <strong>n<\/strong> = negative vector (should be far from the anchor)\n <strong>d(x,y)<\/strong> = cosine distance\n <strong>margin<\/strong> = a buffer that forces the negative to be not just farther, but significantly farther <\/pre>\n\n\n\n<p><strong>What is Cosine Similarity?<\/strong><\/p>\n\n\n\n<p>Cosine similarity is a measure of how similar two vectors are \u2014 based on the angle between them rather than their magnitude. In the context of NLP, vectors represent words or sentences as embeddings.<\/p>\n\n\n\n<p class=\"has-text-align-center\" style=\"line-height:1.1\"><strong>CosineSimilarity(A, B) = (A \u00b7 B) \/ (||A|| ||B||)<\/strong><\/p>\n\n\n\n<p class=\"has-text-align-center\"><strong>CosineDistance(A, B) = 1 \u2212 CosineSimilarity(A, B)<\/strong><\/p>\n\n\n\n<p><strong>What is Margin?<\/strong><\/p>\n\n\n\n<p>The margin is a safety cushion.<\/p>\n\n\n\n<p>If margin = 0.2, then even if the negative is slightly farther than the positive, the model still gets a penalty unless it\u2019s at least 0.2 farther.<\/p>\n\n\n\n<h5><strong>Testing the Accuracy<\/strong><\/h5>\n\n\n\n<h5 style=\"font-size:19px\"><a href=\"https:\/\/docs.google.com\/document\/d\/1XpjGZTKt_LybkUIjUVHOeHAXkhVUN6lYGnuHkTbO_L4\/edit?tab=t.e34yzpa6ckdw\"><strong>TEST-SET<\/strong><\/a> (click to view the full test set)<\/h5>\n\n\n\n<p>We ran each model over a set of ~50 curated triplets and calculated:<\/p>\n\n\n\n<ul><li>Anchor\u2013Positive distance (AP)<\/li><li>Anchor\u2013Negative distance (AN)<\/li><li>Triplet loss<\/li><\/ul>\n\n\n\n<p>We then visualized both the individual performance per triplet and the overall averages.<\/p>\n\n\n\n<pre 
class=\"wp-block-preformatted\"> (\"asthma\", \"respiratory condition\", \"spiritual awakening\"),\n (\"pneumonia\", \"lung infection\", \"foggy window\"),\n # General &amp; Internal Medicine\n (\"diabetes\", \"high blood sugar\", \"candy addiction\"),\n (\"arthritis\", \"joint inflammation\", \"rusty hinge\"),\n # ...50+ such triplets in total <\/pre>\n\n\n\n<p><strong>Results:<\/strong><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"981\" height=\"590\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2025\/07\/apan.png\" alt=\"\" class=\"wp-image-4777\" \/><figcaption>The lower the cosine distance, the better the contextual understanding, so AP should be small. AP stays low for SBERT &amp; USE, while FastText keeps getting it wrong.<\/figcaption><\/figure>\n\n\n\n<p><em>Using PCA, we visualized where each model placed the anchor, positive, and negative in space. In the images below you can see the anchor and positive clustering together, especially in SBERT\u2019s case, while the negative floats far away.<\/em><\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"984\" height=\"590\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2025\/07\/updatedsbert.png\" alt=\"\" class=\"wp-image-4788\" \/><figcaption>For a better sense of scale, the Euclidean distances show the anchor &amp; positive lying relatively closer together than the negative (SBERT)<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"986\" height=\"590\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2025\/07\/USEupdated.png\" alt=\"\" class=\"wp-image-4789\" \/><figcaption>USE also classifies hard medical terminology quite precisely: diabetes = high blood sugar, not a candy addiction<\/figcaption><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"986\" height=\"590\" 
src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2025\/07\/FASTTEXT.png\" alt=\"\" class=\"wp-image-4780\" \/><figcaption>FastText, on the other hand, placed everything in essentially the same spot: cardiac arrest = heart attack = cardi b arrest, which is wrong<\/figcaption><\/figure>\n\n\n\n<p>Per-triplet scores:<\/p>\n\n\n\n<figure class=\"wp-block-image size-large\"><img width=\"1188\" height=\"590\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2025\/07\/Frame-34-1.png\" alt=\"\" class=\"wp-image-4790\" \/><figcaption>SBERT and USE performed efficiently apart from 2\u20133 examples, while FastText failed to understand context at all (it only captured character-level similarity)<\/figcaption><\/figure>\n\n\n\n<p>As these visualizations and the triplet-loss tracking show, SBERT &amp; USE performed well.<\/p>\n\n\n\n<p><strong>CONCLUSION<\/strong>:<\/p>\n\n\n\n<p>What We Learned<\/p>\n\n\n\n<p>SBERT is highly reliable for understanding sentence-level meaning<\/p>\n\n\n\n<p>USE performs reasonably well and is easy to use with TensorFlow<\/p>\n\n\n\n<p>FastText, while fast, struggles with context and full sentences<\/p>\n\n\n\n<p><strong>Visual Results<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> Triplet Loss (Lower = Better)\n <strong>SBERT \u00a0 \u00a0 : <\/strong>0.0381\n <strong>USE \u00a0 \u00a0 \u00a0 : <\/strong>0.0320\n <strong>FastText\u00a0 : <\/strong>0.2175 <\/pre>\n\n\n\n<p class=\"has-text-align-center\"><img width=\"345\" height=\"302\" src=\"https:\/\/lh7-rt.googleusercontent.com\/docsz\/AD_4nXf1qpbx4dbfRvF0N1UBTJ-BvIxvuu5EeGbTvgsa7QDanI6REYtcJ_Yncspyy1vsQOXAeHpFryCyaDJPys4t0RpH2SDLwoItvS_dZW5n6dcJE9UZaqPOgT5BOWwp67FyJgD0RuSwyg?key=7u82-0bEyxqjY00X87VGGQ\"> <\/p>\n\n\n\n<p>If you\u2019re building search engines, recommendation systems, chatbots, or anything involving meaning, good embeddings are key. 
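<\/p>\n\n\n\n<p>To make the metric concrete, here is a minimal sketch of the computation used throughout this post (plain NumPy; the function names are my own, and the 0.2 margin matches this experiment):<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> import numpy as np\n\n def cosine_distance(a, b):\n     # CosineDistance(A, B) = 1 - CosineSimilarity(A, B)\n     return 1.0 - np.dot(a, b) \/ (np.linalg.norm(a) * np.linalg.norm(b))\n\n def triplet_loss(anchor, positive, negative, margin=0.2):\n     # TripletLoss = max(d(a, p) - d(a, n) + margin, 0)\n     return max(cosine_distance(anchor, positive)\n                - cosine_distance(anchor, negative) + margin, 0.0) <\/pre>\n\n\n\n<p>Averaging this loss over all ~50 triplets gives a single score per model, which is how numbers like those above can be reproduced.<\/p>\n\n\n\n<p>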
Triplet loss is a simple yet powerful way to test how smart your model really is. I recommend using triplet loss at the model-selection stage of any NLP or context-based system, to pick the optimal pretrained (or fine-tuned) model.<\/p>\n\n\n\n<div class=\"wp-block-columns\">\n<div class=\"wp-block-column\" style=\"flex-basis:100%\">\n<p><a href=\"https:\/\/github.com\/Atharvkatkar123\/IIT-ROORKEE\/blob\/main\/Embedding.ipynb?short_path=b531a9a\">The source code<\/a> is available if you want to reproduce the experiments. Good luck!<\/p>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Author: Atharv Katkar Linkedin Directed by: Sandeep Giri OVERVIEW: In Natural Language Processing (NLP), embeddings transform human language into numerical vectors. These are usually arrays of multiple dimensions &amp; have schematic meaning based on their previous training text corpus The quality of these embeddings directly affects the performance of search engines, recommendation systems, chatbots, and &hellip; <a href=\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Quality of Embeddings &amp; Triplet Loss&#8221;<\/span><\/a><\/p>\n","protected":false},"author":49,"featured_media":4782,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[67,24,29,28,30],"tags":[269,264,260,265,267,266,259,94,261,270,268,271,262,263],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v16.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Quality of Embeddings &amp; Triplet Loss | CloudxLab Blog<\/title>\n<meta name=\"description\" content=\"Dive into Triplet Loss and sentence embeddings with SBERT, USE, and FastText. 
In this blog, Task Master explores how well these models understand language using real-world, medically themed examples and visualizations to evaluate embedding quality.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Quality of Embeddings &amp; Triplet Loss | CloudxLab Blog\" \/>\n<meta property=\"og:description\" content=\"Dive into Triplet Loss and sentence embeddings with SBERT, USE, and FastText. In this blog, Task Master explores how well these models understand language using real-world, medically themed examples and visualizations to evaluate embedding quality.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/\" \/>\n<meta property=\"og:site_name\" content=\"CloudxLab Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cloudxlab\" \/>\n<meta property=\"article:published_time\" content=\"2025-07-11T10:03:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-10-09T08:35:53+00:00\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2025\/07\/20250705_1849_Embedding-Insights_simple_compose_01jzdbrs16ewt8k7ag1pbgfdjg.png\" \/>\n<meta name=\"twitter:creator\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:site\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\">\n\t<meta name=\"twitter:data1\" content=\"5 minutes\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"CloudxLab Blog\",\"description\":\"Learn AI, Machine Learning, Deep Learning, Devops &amp; Big Data\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/cloudxlab.com\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2025\/07\/20250705_1849_Embedding-Insights_simple_compose_01jzdbrs16ewt8k7ag1pbgfdjg.png\",\"contentUrl\":\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2025\/07\/20250705_1849_Embedding-Insights_simple_compose_01jzdbrs16ewt8k7ag1pbgfdjg.png\",\"width\":1536,\"height\":1024},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#webpage\",\"url\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/\",\"name\":\"Quality of Embeddings &amp; Triplet Loss | CloudxLab Blog\",\"isPartOf\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#primaryimage\"},\"datePublished\":\"2025-07-11T10:03:52+00:00\",\"dateModified\":\"2025-10-09T08:35:53+00:00\",\"author\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/7900f6b02900d0320a9def526e7e7b3f\"},\"description\":\"Dive into Triplet Loss and sentence embeddings with SBERT, USE, and FastText. 
In this blog, Task Master explores how well these models understand language using real-world, medically themed examples and visualizations to evaluate embedding quality.\",\"breadcrumb\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/quality-of-embeddings-triplet-loss-atharv_katkar\/#webpage\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/7900f6b02900d0320a9def526e7e7b3f\",\"name\":\"Atharv Katkar\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/6cc2397c932fa4093b620d68364aa1b4?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/6cc2397c932fa4093b620d68364aa1b4?s=96&d=mm&r=g\",\"caption\":\"Atharv Katkar\"}}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/4776"}],"collection":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/users\/49"}],"replies":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/comments?post=4776"}],"version-history":[{"count":6,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/4776\/revisions"}],"predecessor-version":[{"id":4800,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/4776\/revisions\/4800"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media\/4782"}],"wp:attachment":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media?parent=4776"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/categories?post=4776"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/tags?post=4776"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}