multimodal

In the pharmaceutical industry, biotechnology and healthcare companies face an unprecedented challenge in efficiently managing and analyzing vast amounts of drug-related data from diverse sources. Traditional data analysis methods prove inadequate for processing complex medical documentation that includes a mix of text, images, graphs, and tables. Amazon Bedrock offers features ...

Modern enterprises are rich in data that spans multiple modalities, from text documents and PDFs to presentation slides, images, audio recordings, and more. Imagine asking an AI assistant about your company’s quarterly earnings call: the assistant should not only read the transcript but also “see” the charts in the presentation slides ...

Interferometers, devices that can modulate aspects of light, play the important role of modulating and switching light signals in fiber-optic communications networks and are frequently used for gas sensing and optical computing. Now, applied physicists at the Harvard John A. Paulson School of Engineering and Applied Sciences (SEAS) have invented ...

Organizations today deal with vast amounts of unstructured data in various formats including documents, images, audio files, and video files. Often these documents are quite large, creating significant challenges such as slower processing times and increased storage costs. Extracting meaningful insights from these diverse formats in the past required complex ...

This post is co-written with Ken Tsui, Edward Tsoi and Mickey Yip from Apoidea Group. The banking industry has long struggled with the inefficiencies associated with repetitive processes such as information extraction, document review, and auditing. These tasks, which require significant human resources, slow down critical operations such as Know ...

Multimodal fine-tuning represents a powerful approach for customizing foundation models (FMs) to excel at specific tasks that involve both visual and textual information. Although base multimodal models offer impressive general capabilities, they often fall short when faced with specialized visual tasks, domain-specific content, or particular output formatting requirements. Fine-tuning addresses ...

Amazon Bedrock Guardrails announces the general availability of image content filters, enabling you to moderate both image and text content in your generative AI applications. Previously limited to text-only filtering, this enhancement now provides comprehensive content moderation across both modalities. This new capability removes the heavy lifting required to build ...

Gartner predicts that “40% of generative AI solutions will be multimodal (text, image, audio and video) by 2027, up from 1% in 2023.” The McKinsey 2023 State of AI Report identifies data management as a major obstacle to AI adoption and scaling. Enterprises generate massive volumes of unstructured data, ...

This is a guest post authored by the team at ByteDance. ByteDance is a technology company that operates a range of content platforms to inform, educate, entertain, and inspire people across languages, cultures, and geographies. Users trust and enjoy our content platforms because of the rich, intuitive, and safe experiences ...

This post is co-written with Andrés Vélez Echeveri and Sean Azlin from OfferUp. OfferUp is an online, mobile-first marketplace designed to facilitate local transactions and discovery. Known for its user-friendly app and trust-building features, including user ratings and in-app chat, OfferUp enables users to buy and sell items and explore ...