Categories
NVIDIA News

Invitation to NVIDIA GTC 2025: Explore the Future of AI

Join NVIDIA GTC 2025, the world’s premier event for artificial intelligence, high-performance computing, and innovation. Discover the latest breakthroughs, connect with industry experts, and see how cutting-edge AI is solving today’s biggest challenges.

📅 Date: March 17–21, 2025
📍 Location: San Jose, California & Online

Don’t miss exclusive keynotes, hands-on workshops, and networking opportunities with top AI professionals. Register now and shape the future of AI!

🔗 More information: NVIDIA GTC 2025.


DeepSeek-R1 Now Live With NVIDIA NIM

Source: DeepSeek-R1 Now Live With NVIDIA NIM | NVIDIA Blog

To help developers securely experiment with DeepSeek-R1 capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice preview on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.

Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.
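NIM microservices generally expose an OpenAI-compatible chat-completions API. As a rough illustration of what a request to the preview endpoint might look like, the sketch below assembles the JSON body; the endpoint URL and model identifier are assumptions based on NVIDIA's usual NIM conventions, so check the model card on build.nvidia.com for the exact values.

```python
import json

# Assumed values -- verify against the DeepSeek-R1 model card on build.nvidia.com.
NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"
MODEL = "deepseek-ai/deepseek-r1"

def build_chat_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble the JSON body for a POST to {NIM_BASE_URL}/chat/completions."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.6,
    }

payload = build_chat_request("Explain what a NIM microservice is.")
print(json.dumps(payload, indent=2))
```

An actual call would send this payload with an API key from build.nvidia.com in the `Authorization` header.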

The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure. Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.

Read more on DeepSeek-R1 Now Live With NVIDIA NIM | NVIDIA Blog.


Fast Forward to Generative AI With NVIDIA Blueprints

NVIDIA Expands AI Workflows With NVIDIA NIM™ and NVIDIA Blueprints

Source: https://blogs.nvidia.com/blog/nim-agent-blueprints/

NVIDIA offers a wide range of software, including NIM (NVIDIA Inference Microservices) and NVIDIA Blueprints, to simplify the deployment of generative AI across industries. NVIDIA NIM™ provides optimized, cloud-native inference microservices for seamless integration of AI models, while NVIDIA Blueprints offer pre-built workflows for faster development and deployment.

These solutions help businesses accelerate AI implementation, reduce infrastructure complexity, and enhance productivity. Whether in the cloud, on-premises, or hybrid environments, NVIDIA’s new AI tools provide flexibility and scalability.

Learn more about NVIDIA Blueprints: NVIDIA AI Workflows.


NVIDIA Brings Grace Blackwell AI Supercomputing to Every Desk

Source: NVIDIA Puts Grace Blackwell on Every Desk and at Every AI Developer’s Fingertips | NVIDIA Newsroom

At CES 2025, NVIDIA introduced Project DIGITS, a personal AI supercomputer designed to give AI researchers, data scientists, and students desktop access to the NVIDIA Grace Blackwell platform. Central to the system is the new NVIDIA GB10 Grace Blackwell Superchip, delivering up to 1 petaflop of AI performance at FP4 precision.

The GB10 integrates an NVIDIA Blackwell GPU with the latest CUDA® cores and fifth-generation Tensor Cores, connected via NVLink®-C2C to a high-performance NVIDIA Grace™ CPU comprising 20 Arm-based cores. Developed in collaboration with MediaTek, the GB10 emphasizes power efficiency and performance.

Each Project DIGITS unit includes 128GB of unified memory and up to 4TB of NVMe storage, enabling it to handle AI models with up to 200 billion parameters. For larger models, two units can be linked to support up to 405 billion parameters. This setup allows users to develop and run inference on models locally, then seamlessly deploy them on accelerated cloud or data center infrastructure.
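The stated model limits line up with simple back-of-the-envelope arithmetic: at FP4, each parameter occupies half a byte, so the weights alone must fit in unified memory. The sketch below is our own sanity check, not an NVIDIA sizing tool.

```python
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight footprint in GB (using 1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

# 200B parameters at FP4 (4 bits/param) -> 100 GB of weights, which fits
# in one unit's 128 GB of unified memory with headroom for KV cache and
# activations.
print(weight_gb(200, 4))  # 100.0

# Two linked units (256 GB combined): 405B parameters at FP4 -> 202.5 GB.
print(weight_gb(405, 4))  # 202.5
```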


The Importance of GPU Memory for AI Performance

Source: GPU Memory Essentials for AI Performance | NVIDIA Technical Blog

The NVIDIA blog highlights the critical role of GPU memory capacity in running advanced artificial intelligence (AI) models. Large AI models, such as Llama 2 with 7 billion parameters, require significant amounts of memory. For instance, processing at FP16 precision demands at least 28 GB of memory.
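As a rule of thumb, the weights of a model take parameters × bytes-per-parameter; at serving time, KV cache, activations, and framework buffers add more on top. The sketch below is our own rough estimator, with the 2× overhead factor an assumption rather than a figure from the NVIDIA blog; with it, a 7B model's 14 GB of FP16 weights lands near the 28 GB figure cited above.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def estimate_memory_gb(params_billion: float, precision: str,
                       overhead: float = 2.0) -> float:
    """Rough serving estimate: weight footprint times an overhead factor
    covering KV cache, activations, and framework buffers (the 2.0 default
    is an assumption, not a measured value)."""
    weights_gb = params_billion * BYTES_PER_PARAM[precision]
    return weights_gb * overhead

# Llama 2 7B at FP16: 14 GB of weights, ~28 GB with runtime overhead.
print(estimate_memory_gb(7, "fp16"))  # 28.0
```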

NVIDIA offers high-performance RTX GPUs, such as the RTX 6000 Ada Generation, featuring up to 48 GB of VRAM. These GPUs are designed to handle the largest AI models, enabling local development and execution of complex tasks. Additionally, they come equipped with specialized hardware, including Tensor Cores, which significantly accelerate computations required for AI workloads.

With NVIDIA’s powerful solutions, businesses and researchers can optimize the development and deployment of AI models directly on local devices, opening up new possibilities for advancements in artificial intelligence.

For more details, visit the official NVIDIA blog: developer.nvidia.com.


Interested in learning more about NVIDIA’s powerful solutions? Contact Xenya d.o.o., and we’ll be happy to help you find the right solution for your needs!


Xenya Announces Sponsorship at Peering Days 2025 in Split

We are excited to announce that Xenya will be a proud sponsor of the upcoming Peering Days event, scheduled to take place in March 2025 in Split. This prestigious gathering brings together professionals from the internet networking and peering community to share knowledge, explore new technologies, and foster collaborations.

As a leading provider of advanced networking solutions, Xenya’s participation underscores our commitment to enhancing connectivity and fostering technological advancements. The event will feature a series of talks, workshops, and networking sessions that align with our goals of promoting innovative technologies and sustainable growth in the IT sector.

We invite all attendees to join us at the event where our team will be available to discuss the latest trends, share insights, and explore potential collaborations. For more information on the event schedule and registration details, please visit the website at Peering Days.

Stay tuned for updates and we look forward to seeing you in Split!


Integration of NVIDIA BlueField DPUs with WEKA Client Boosts AI Workload Efficiency

Source: Integration of NVIDIA BlueField DPUs with WEKA Client Boosts AI Workload Efficiency | NVIDIA Technical Blog

WEKA and NVIDIA are collaborating to integrate NVIDIA BlueField DPUs (data processing units) with WEKA’s data storage platform, enhancing AI workload efficiency. Running the WEKA client directly on the BlueField DPU instead of the host server’s CPU improves data transfer rates, reduces latency, and frees host CPU cycles, while moving storage operations onto the DPU also strengthens system security.

These integrations were showcased at the Supercomputing 2024 conference, where attendees saw firsthand how enhanced data access speeds and efficient workload processing can transform data center operations. For more detailed information, read Integration of NVIDIA BlueField DPUs with WEKA Client Boosts AI Workload Efficiency on the NVIDIA Technical Blog.


An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio

Source: An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio | NVIDIA Technical Blog

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty lies in capturing and indexing information across multiple modalities, including text, images, tables, audio, video, and more. In NVIDIA’s previous post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, the authors discussed how to handle text and images. This post extends the conversation to audio and video, exploring how to build a multimodal RAG pipeline that searches for information in videos.
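The common thread across modalities is that every asset, whether a text chunk, an image caption, or a video-frame transcript, is reduced to an embedding vector, and retrieval becomes a nearest-neighbor search over those vectors. The toy sketch below illustrates that core idea; the 3-dimensional embeddings and item descriptions are made up, and a real pipeline would use a multimodal embedding model and a vector database.

```python
import math

# Toy multimodal index: (modality, description, made-up embedding).
INDEX = [
    ("text",  "quarterly revenue summary",  [0.9, 0.1, 0.0]),
    ("video", "keynote clip on Blackwell",  [0.1, 0.8, 0.3]),
    ("audio", "earnings-call excerpt",      [0.7, 0.2, 0.1]),
]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vec, k=2):
    """Return the k index entries closest to the query embedding."""
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[2]),
                    reverse=True)
    return [(modality, desc) for modality, desc, _ in ranked[:k]]

# A query embedded near the video item retrieves it first.
print(search([0.1, 0.9, 0.2], k=1))
```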

Read more on An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio | NVIDIA Technical Blog.


Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA

Source: Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA

The rapid development of solutions using retrieval-augmented generation (RAG) for question-and-answer LLM workflows has led to new types of system architectures. NVIDIA’s work using AI for internal operations has yielded several important findings about aligning system capabilities with user expectations.

NVIDIA found that regardless of the intended scope or use case, users generally want to be able to execute non-RAG tasks like performing document translation, editing emails, or even writing code. A vanilla RAG application might be implemented so that it executes a retrieval pipeline on every message, leading to excess usage of tokens and unwanted latency as irrelevant results are included.
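One common mitigation is to route each message through a lightweight check that decides whether retrieval is needed at all. The sketch below is our own illustration of that pattern, not NVIDIA's published design; production systems typically use an LLM or a trained intent classifier for the decision, and the keyword heuristic and stub functions here are purely illustrative.

```python
# Phrases that suggest a self-contained, non-RAG task (illustrative only).
NON_RAG_HINTS = ("translate", "rewrite", "edit this email", "write code")

def retrieve_documents(query):
    """Stub standing in for a vector-store lookup."""
    return ["<retrieved passage>"]

def generate(message, context=None):
    """Stub standing in for an LLM call."""
    return f"answer(context={'yes' if context else 'no'})"

def needs_retrieval(message: str) -> bool:
    """Skip the retrieval pipeline for messages that look like
    self-contained tasks (translation, editing, code generation)."""
    lowered = message.lower()
    return not any(hint in lowered for hint in NON_RAG_HINTS)

def answer(message: str) -> str:
    if needs_retrieval(message):
        context = retrieve_documents(message)
        return generate(message, context=context)
    # Self-contained task: no tokens or latency spent on retrieval.
    return generate(message, context=None)

print(answer("What does the on-call policy say about weekends?"))
print(answer("Please translate this paragraph into German."))
```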

Read more on Creating RAG-Based Question-and-Answer LLM Workflows at NVIDIA.


Xenya at ASM’24

On Friday, 6 December, the Chamber of Commerce and Industry of Slovenia hosted the 20th annual Automation of Handling and Assembly – ASM ’24.

It is the only specialized professional conference of its kind, organized by the Faculty of Mechanical Engineering at the University of Ljubljana, and it has sparked significant interest among companies regarding the future of manufacturing processes. The event serves as a platform for meetings and discussions where experts in automation and industry can actively exchange ideas.

Innovation and competitiveness, along with collaboration, drive economic progress and can position us at the forefront on a global level. Advanced artificial intelligence is indispensable in planning, execution, servicing, or relocating manufacturing processes, as it enables faster planning and optimal execution of all activities.

XENYA d.o.o., as an NVIDIA Elite Partner, introduced the NVIDIA Omniverse™ platform this year, with the aim of providing Slovenian businesses with the best solution for creating, planning, visualizing, and establishing new manufacturing processes.

NVIDIA Omniverse™ is an open platform for collaboration and simulation in a virtual 3D environment, built on the open-source OpenUSD (Universal Scene Description) format. Omniverse lets users create, develop, and manage complex 3D virtual worlds in real time, enabling simultaneous changes and shared insight into a common virtual space. It also performs advanced functions, such as ensuring that objects in the 3D world obey real-world physical laws.

The platform is designed for designers, engineers, and developers to collaboratively work on projects such as 3D models, animations, simulations, and digital twins, leveraging a range of the most renowned third-party software tools in their fields. Since testing in a virtual environment can save significant time and costs, Omniverse is particularly valuable in industries requiring a high degree of realism and precise real-world simulation, such as architecture, engineering, automotive, and other complex industrial manufacturing, as well as the film industry.

Companies such as Coca-Cola, Amazon, BMW, Wistron, and many others are already digitizing their processes with NVIDIA Omniverse™. Learn more at:
