Accelerating Generative AI – Options for Conquering the Dataflow Bottlenecks

Presented by

M. Baldi, AMD; R. Davis, NVIDIA; D. Eggleston, Microchip; D. McIntyre, Samsung; A. Rodriguez, Intel; J. White, Dell

About this talk

Generative AI workloads built on large language models are frequently throttled by insufficient resources — memory, storage, compute, or network dataflow bottlenecks. If not identified and addressed, these bottlenecks can constrain Gen AI application performance well below optimal levels. Given the compelling uses across natural language processing (NLP), video analytics, document resource development, image processing, image generation, and text generation, running these workloads efficiently has become critical to many IT and industry segments. The resources that contribute to generative AI performance and efficiency include CPUs, DPUs, GPUs, and FPGAs, plus memory and storage controllers. This webinar, with a broad cross-section of industry veterans, provides insight into the following:

• Defining the Gen AI dataflow bottlenecks
• Tools and methods for identifying acceleration options
• Matchmaking the right xPU solution to the target Gen AI workload(s)
• Optimizing the network to support acceleration options
• Moving data closer to processing, or processing closer to data
• The role of the software stack in determining Gen AI performance
SNIA is a not-for-profit global organization made up of corporations, universities, startups, and individuals. Its members collaborate to develop and promote vendor-neutral architectures, standards, and education for the management, movement, and security of data. SNIA focuses on the transport, storage, acceleration, format, protection, and optimization of data infrastructure.