Perform multimodal image search and visualization using CLIP, ChromaDB, UMAP and Bokeh

In this blog post, I am going to show you how to perform image search against the Unsplash Lite dataset of 25k photos, using both text and image queries. To better understand the search results, I will then visualize the query and its matches using UMAP dimensionality reduction and plot them with Bokeh. By the end of this post, we will have generated a plot like this, showing both our input query image and the photos in the Unsplash Lite dataset that are closest in semantic meaning to it....

May 31, 2025 · 15 min · 3172 words · Tiffena Kou

Generate meaningful insights from Japanese content with Topic Modeling using BERTopic

Today, information overload has become a daily problem. We often need to probe a large body of information and uncover its common underlying themes so that we better understand what we are facing. For example, we might want to quickly grasp user reviews and see whether customers are loving or hating us, and for each category, what exactly are the biggest areas that we need to focus our...

May 4, 2025 · 19 min · 9090 words · Tiffena Kou