
Invitation to NVIDIA GTC 2025: Explore the Future of AI

Join NVIDIA GTC 2025, the world’s premier event for artificial intelligence, high-performance computing, and innovation. Discover the latest breakthroughs, connect with industry experts, and see how cutting-edge AI is solving today’s biggest challenges.

📅 Date: March 17–21, 2025
📍 Location: San Jose, California & Online

Don’t miss exclusive keynotes, hands-on workshops, and networking opportunities with top AI professionals. Register now and shape the future of AI!

🔗 More information: NVIDIA GTC 2025.


DeepSeek-R1 Now Live With NVIDIA NIM

Source: DeepSeek-R1 Now Live With NVIDIA NIM | NVIDIA Blog

To help developers securely experiment with DeepSeek-R1 capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.

Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.

The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.
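
For developers who want to try the preview programmatically, the hosted endpoint follows the same OpenAI-compatible chat completions convention as other NIM previews on build.nvidia.com. The short Python sketch below is illustrative only; the base URL, model identifier, and key format are assumptions based on that convention, so check the model page in the API catalog for the exact values.

    # Minimal sketch: calling the hosted DeepSeek-R1 preview through an
    # OpenAI-compatible chat endpoint. Base URL and model identifier are
    # assumptions; confirm them on the build.nvidia.com model page.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://integrate.api.nvidia.com/v1",   # assumed NIM API endpoint
        api_key="nvapi-...",                               # your build.nvidia.com API key
    )

    response = client.chat.completions.create(
        model="deepseek-ai/deepseek-r1",                   # assumed model identifier
        messages=[{"role": "user", "content": "Explain NVLink in two sentences."}],
        temperature=0.6,
        max_tokens=512,
    )
    print(response.choices[0].message.content)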

Read more on DeepSeek-R1 Now Live With NVIDIA NIM | NVIDIA Blog.


Fast Forward to Generative AI With NVIDIA Blueprints

NVIDIA Expands AI Workflows With NVIDIA NIM™ and NVIDIA Blueprints

Source: https://blogs.nvidia.com/blog/nim-agent-blueprints/

NVIDIA offers a wide range of software, including NIM (NVIDIA Inference Microservices) and NVIDIA Blueprints, to simplify the deployment of generative AI across industries. NVIDIA NIM™ provides optimized, cloud-native inference microservices for seamless integration of AI models, while NVIDIA Blueprints offer pre-built workflows for faster development and deployment.

These solutions help businesses accelerate AI implementation, reduce infrastructure complexity, and enhance productivity. Whether in the cloud, on-premises, or hybrid environments, NVIDIA’s new AI tools provide flexibility and scalability.

Learn more about NVIDIA Blueprints: NVIDIA AI Workflows.


NVIDIA Brings Grace Blackwell AI Supercomputing to Every Desk

Source: NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips | NVIDIA Newsroom

At CES 2025, NVIDIA introduced Project DIGITS, a personal AI supercomputer designed to provide AI researchers, data scientists, and students with desktop access to the NVIDIA Grace Blackwell platform. Central to this system is the new NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1 petaflop of AI performance at FP4 precision.

The GB10 integrates an NVIDIA Blackwell GPU with the latest CUDA® cores and fifth-generation Tensor Cores, connected via NVLink®-C2C to a high-performance NVIDIA Grace™ CPU comprising 20 Arm-based cores. Developed in collaboration with MediaTek, the GB10 emphasizes power efficiency and performance.

Each Project DIGITS unit includes 128GB of unified memory and up to 4TB of NVMe storage, enabling the handling of AI models with up to 200 billion parameters. For larger models, two units can be linked to support up to 405 billion parameters. This setup allows users to develop and run inference on models locally and seamlessly deploy them on accelerated cloud or data center infrastructures.
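
The parameter figures above follow from simple memory arithmetic: at FP4 precision each parameter occupies roughly half a byte, so a 200-billion-parameter model needs on the order of 100 GB just for its weights, which fits in one unit's 128 GB of unified memory, and two linked units can accommodate roughly 405 billion parameters. The back-of-the-envelope check below is a rough sketch that ignores activations, KV cache, and runtime overhead.

    # Rough weight-memory check for the Project DIGITS figures. Ignores
    # activations, KV cache, and runtime overhead, so practical limits are lower.
    BYTES_PER_PARAM_FP4 = 0.5            # 4 bits per parameter

    def fp4_weight_gb(params_billions: float) -> float:
        """Approximate FP4 weight footprint in GB."""
        return params_billions * BYTES_PER_PARAM_FP4

    print(fp4_weight_gb(200))   # ~100 GB -> fits in a single 128 GB unit
    print(fp4_weight_gb(405))   # ~203 GB -> needs two linked units (256 GB combined)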


The Importance of GPU Memory for AI Performance

Source: GPU Memory Essentials for AI Performance | NVIDIA Technical Blog

The NVIDIA blog highlights the critical role of GPU memory capacity in running advanced artificial intelligence (AI) models. Large AI models, such as Llama 2 with 7 billion parameters, require significant amounts of memory. For instance, processing at FP16 precision demands at least 28 GB of memory.
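
As a rule of thumb, weight memory is roughly the parameter count multiplied by the bytes per parameter, and the total inference footprint is higher once activations and the KV cache are included. The sketch below uses a doubling factor as a common heuristic rather than an NVIDIA-published figure, and it shows why a 7-billion-parameter model at FP16 lands in the range cited above.

    # Rough estimate of GPU memory for inference. The 2x overhead factor
    # (activations, KV cache, framework buffers) is a common heuristic,
    # not an exact or NVIDIA-published number.
    BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1, "fp4": 0.5}

    def inference_memory_gb(params_billions: float, precision: str, overhead: float = 2.0) -> float:
        weights_gb = params_billions * BYTES_PER_PARAM[precision]
        return weights_gb * overhead

    # Llama 2 7B at FP16: ~14 GB of weights, ~28 GB with overhead.
    print(inference_memory_gb(7, "fp16"))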

NVIDIA offers high-performance RTX GPUs, such as the RTX 6000 Ada Generation, featuring up to 48 GB of VRAM. These GPUs are designed to handle demanding AI models, enabling local development and execution of complex tasks. Additionally, they come equipped with specialized hardware, including Tensor Cores, which significantly accelerate the computations required for AI workloads.

With NVIDIA’s powerful solutions, businesses and researchers can optimize the development and deployment of AI models directly on local devices, opening up new possibilities for advancements in artificial intelligence.

For more details, visit the official NVIDIA blog: developer.nvidia.com.


Interested in learning more about NVIDIA’s powerful solutions? Contact Xenya d.o.o., and we’ll be happy to help you find the right solution for your needs!


An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio

Source: An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio | NVIDIA Technical Blog

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information across multiple modalities, including text, images, tables, audio, video, and more. In a previous NVIDIA post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, the authors discussed how to tackle text and images. This post extends the conversation to audio and video, exploring how to build a multimodal RAG pipeline to search for information in videos.
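
The post itself builds the pipeline on NVIDIA components; purely as a library-agnostic illustration of the idea, the sketch below transcribes a video's audio track, embeds the timestamped transcript chunks, and retrieves the segments most relevant to a question. Whisper and sentence-transformers are stand-ins chosen for brevity, not the tools used in the blog, and the file name is hypothetical.

    # Minimal audio/video RAG sketch: transcribe -> chunk -> embed -> retrieve.
    # whisper and sentence-transformers are illustrative stand-ins, not the
    # components from the NVIDIA post; "lecture.mp4" is a hypothetical file.
    import numpy as np
    import whisper                                    # openai-whisper
    from sentence_transformers import SentenceTransformer

    asr = whisper.load_model("base")
    segments = asr.transcribe("lecture.mp4")["segments"]    # timestamped text chunks

    texts = [s["text"] for s in segments]
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = embedder.encode(texts, normalize_embeddings=True)

    def search(question: str, k: int = 3):
        q = embedder.encode([question], normalize_embeddings=True)[0]
        scores = doc_vecs @ q                         # cosine similarity (normalized vectors)
        top = np.argsort(scores)[::-1][:k]
        return [(segments[i]["start"], texts[i]) for i in top]

    # The retrieved, timestamped snippets would then be passed to an LLM as context.
    for start, text in search("What is said about GPU memory?"):
        print(f"[{start:.0f}s] {text}")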

Read more on An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio | NVIDIA Technical Blog.


Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA

Source: Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA

The rapid development of solutions using retrieval-augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. NVIDIA's work using AI for internal operations has yielded several important findings about aligning system capabilities with user expectations.

NVIDIA found that regardless of the intended scope or use case, users generally want to be able to execute non-RAG tasks like translating documents, editing emails, or even writing code. A vanilla RAG application might be implemented so that it executes a retrieval pipeline on every message, leading to excess token usage and unwanted latency as irrelevant results are included.
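
One common mitigation is to place a lightweight router in front of the pipeline that decides whether a message actually needs document retrieval before the retriever runs. The sketch below shows the pattern only; it is not the architecture described in the post, and the endpoint and router model name are assumptions.

    # Illustrative routing pattern: run retrieval only when the message needs it,
    # avoiding excess tokens and latency on every turn. Endpoint and model name
    # are assumptions, not taken from the NVIDIA post.
    from openai import OpenAI

    client = OpenAI(base_url="https://integrate.api.nvidia.com/v1", api_key="nvapi-...")

    ROUTER_PROMPT = (
        "Reply YES if answering this message requires looking up internal "
        "documents, otherwise reply NO.\n\nMessage: {msg}"
    )

    def needs_retrieval(msg: str) -> bool:
        out = client.chat.completions.create(
            model="meta/llama-3.1-8b-instruct",        # assumed small, cheap router model
            messages=[{"role": "user", "content": ROUTER_PROMPT.format(msg=msg)}],
            max_tokens=2,
        )
        return out.choices[0].message.content.strip().upper().startswith("Y")

    msg = "Translate this email into German."
    if needs_retrieval(msg):
        print("run the retrieval pipeline, then generate with retrieved context")
    else:
        print("answer directly; no retrieval, fewer tokens, lower latency")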

Read more on Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA.


What Is Agentic AI?

Source: What Is Agentic AI?

Agentic AI uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems.

AI chatbots use generative AI to provide responses based on a single interaction. A person makes a query and the chatbot uses natural language processing to reply.

The next frontier of artificial intelligence is agentic AI, which uses sophisticated reasoning and iterative planning to autonomously solve complex, multi-step problems. And it’s set to enhance productivity and operations across industries.

Agentic AI systems ingest vast amounts of data from multiple sources to independently analyze challenges, develop strategies, and execute tasks such as optimizing supply chains, analyzing cybersecurity vulnerabilities, and assisting doctors with time-consuming work.

View more on What Is Agentic AI?


Access to NVIDIA NIM Now Available Free to Developer Program Members

Source: Access to NVIDIA NIM Now Available Free to Developer Program Members

The ability to use simple APIs to integrate pretrained AI foundation models into products and experiences has significantly increased developer usage of LLM endpoints and application development frameworks. NVIDIA NIM enables developers and engineering teams to rapidly deploy their own AI model endpoints for the secure development of accelerated generative AI applications using popular development tools and frameworks.

Developers said they want easier access to NIM for development purposes, so NVIDIA is excited to provide free access to downloadable NIM microservices for development, testing, and research to over 5M NVIDIA Developer Program members. Members of the program receive comprehensive resources, training, tools, and access to a community of experts to help them build accelerated applications and solutions.

View more on Access to NVIDIA NIM Now Available Free to Developer Program Members.


A Simple Guide to Deploying Generative AI with NVIDIA NIM

Source: A Simple Guide to Deploying Generative AI with NVIDIA NIM

Whether you’re working on-premises or in the cloud, NVIDIA NIM microservices provide enterprise developers with easy-to-deploy optimized AI models from the community, partners, and NVIDIA. Part of NVIDIA AI Enterprise, NIM offers a secure, streamlined path forward to iterate quickly and build innovations for world-class generative AI solutions.

Using a single optimized container, you can easily deploy a NIM microservice in under 5 minutes on accelerated NVIDIA GPU systems in the cloud or data center, or on workstations and PCs. Alternatively, if you want to avoid deploying a container, you can begin prototyping your applications with NIM APIs from the NVIDIA API Catalog.
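
Once the container is running, an LLM NIM exposes an OpenAI-compatible endpoint, by default on port 8000, so existing client code can simply point at it. The snippet below is a minimal sketch; the local URL and model name depend on which NIM you deployed and are assumptions here.

    # Minimal sketch: querying a locally deployed LLM NIM through its
    # OpenAI-compatible API. Port 8000 is the usual default; the model name
    # must match the NIM container you launched and is an assumption here.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-used-locally")

    resp = client.chat.completions.create(
        model="meta/llama-3.1-8b-instruct",            # assumed: matches the deployed NIM
        messages=[{"role": "user", "content": "Summarize NVIDIA NIM in one sentence."}],
        max_tokens=128,
    )
    print(resp.choices[0].message.content)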

View more on A Simple Guide to Deploying Generative AI with NVIDIA NIM.
