Complete Book & Media Supply, LLC.


ISBN 9798267557214 is currently unpriced. Please contact us for pricing.
Available options are listed below:

Multimodal AI Workflows with Hugging Face: Combining embeddings, image and text models, and retrieval frameworks for advanced anomaly detection and r

AUTHOR	Verran, Emma
PUBLISHER	Independently Published (09/28/2025)
PRODUCT TYPE	Paperback (Paperback)

Description

Master multimodal AI with Hugging Face and build real systems that combine text, images, and retrieval for production-grade workflows.

Modern AI no longer relies on a single data type. Real applications demand models that connect text with images, integrate structured retrieval, and deliver results that scale in production. The challenge for practitioners is moving beyond theory into working pipelines that handle these tasks reliably. Multimodal AI Workflows with Hugging Face shows you exactly how to do it.

This book takes you from core embeddings and vector search to advanced multimodal retrieval-augmented generation, anomaly detection, and recommender systems. Each section connects the underlying concepts with practical code, helping you move from understanding to implementation with confidence.

What you will learn:

How to work with modern text and image embeddings using CLIP, OpenCLIP, SigLIP, and Sentence-Transformers
Practical vector search with FAISS, Weaviate, Milvus, and pgvector
Building multimodal retrieval-augmented generation systems with LlamaIndex and Haystack
Implementing anomaly detection with Anomalib, using PaDiM and PatchCore on the MVTec AD dataset
Designing recommendation engines that combine image and text signals
Using vision-language models such as BLIP-2 and Idefics2 for document and chart understanding
Evaluating systems with metrics including AUROC, PRO, Recall, NDCG, and pixel-level accuracy
Deploying and monitoring multimodal systems in real-world finance, healthcare, manufacturing, and retail scenarios
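
To give a sense of the retrieval and evaluation topics listed above, here is a minimal sketch that is not taken from the book. It ranks a few items by cosine similarity to a query vector and then computes recall over the top results. The vectors are made-up three-dimensional stand-ins for real text and image embeddings, and the helper names (cosine, top_k, recall_at_k) and file names are hypothetical. It uses only the Python standard library, where a real pipeline would more likely use an embedding model and a vector index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, index, k):
    """Return the ids of the k indexed vectors most similar to the query."""
    ranked = sorted(
        index,
        key=lambda item_id: cosine(query, index[item_id]),
        reverse=True,
    )
    return ranked[:k]


def recall_at_k(retrieved, relevant):
    """Fraction of the relevant ids that appear in the retrieved list."""
    return len(set(retrieved) & set(relevant)) / len(relevant)


# Toy 3-d vectors standing in for image embeddings (hypothetical values).
image_index = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.8, 0.3, 0.1],
    "invoice.png": [0.0, 0.2, 0.9],
}

# Toy vector standing in for the embedding of a text query (hypothetical).
text_query = [1.0, 0.2, 0.0]

hits = top_k(text_query, image_index, k=2)
score = recall_at_k(hits, relevant=["cat.jpg", "dog.jpg"])
print(hits, score)
```

A brute-force scan like this is only practical for small collections. Larger ones are where the vector search tools named in the list, such as FAISS or pgvector, come in.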

Code included:
This is a code-heavy guide packed with working Python examples. Every major concept is illustrated with runnable code so you can build your own retrieval, anomaly detection, and recommendation pipelines directly from the text.

Whether you are a machine learning engineer, data scientist, or developer interested in production-ready AI, this book provides the hands-on foundation you need to connect multimodal models with real industry use cases.

Grab your copy today and start building the next generation of AI systems.

Product Format

Product Details

ISBN-13: 9798267557214

Binding: Paperback or Softback (Trade Paperback (US))

Content Language: English

More Product Details

Page Count: 320

Carton Quantity: 12

Product Dimensions: 7.00 x 0.67 x 10.00 inches

Weight: 1.23 pound(s)

Country of Origin: US

Subject Information

BISAC Categories

Computers | Artificial Intelligence - General


Out of Stock

