Prototype and Deploy LLM Applications on Intel® NPUs

Presented by

Intel Tiber

About this talk

Model size, combined with the limited hardware resources of client devices (for example, disk, RAM, or CPU), makes it increasingly challenging to deploy large language models (LLMs) on laptops compared to cloud-based solutions. The AI PC from Intel addresses this by combining a CPU, GPU, and NPU in one device. This session focuses on the NPU and shows how to prototype and deploy LLM applications locally. It also covers:

- How the NPU architecture works, including its features, advantages, and capabilities for accelerating neural network computations on Intel® Core™ Ultra processors (the backbone of AI PCs from Intel).
- Practical aspects of deploying performant LLM apps on Intel NPUs, from initial setup to optimization and system partitioning, using the OpenVINO™ toolkit and its NPU plug-in (see the first sketch after this list).
- What LLMs are, and the advantages and challenges of local inference.
- Fast LLM prototyping on Intel Core Ultra processors using the Intel® NPU Acceleration Library (see the second sketch after this list).

Get real-world examples and case studies (like chatbots and retrieval-augmented generation [RAG]) that showcase the seamless integration of LLM applications with NPUs, including how this synergy can unlock performance and efficiency.

Skill level: All
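
As a starting point for the OpenVINO deployment flow mentioned above, here is a minimal sketch. It assumes OpenVINO is installed (pip install openvino) on an Intel Core Ultra system with NPU drivers, and that "model.xml" is a placeholder for a model already converted to OpenVINO IR format.

    # Minimal sketch: verify the NPU is visible to OpenVINO and compile a model for it.
    import openvino as ov

    core = ov.Core()
    print(core.available_devices)  # on an AI PC this typically includes 'CPU', 'GPU', and 'NPU'

    # "model.xml" is a placeholder for a model already converted to OpenVINO IR
    compiled_model = core.compile_model("model.xml", device_name="NPU")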
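
For the fast-prototyping path with the Intel NPU Acceleration Library, the sketch below loads a small Hugging Face model and compiles it for the NPU. The compile call and its arguments follow the library's published examples but may differ between versions, and the model name is only an illustration; treat both as assumptions.

    # Hedged sketch: offload a small causal LM to the NPU with intel_npu_acceleration_library.
    # Assumes: pip install intel-npu-acceleration-library transformers torch
    import torch
    import intel_npu_acceleration_library
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # example model; swap in your own
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Compile the PyTorch model so supported layers run on the NPU
    # (exact arguments vary by library version; check the current documentation)
    model = intel_npu_acceleration_library.compile(model, dtype=torch.float16)

    inputs = tokenizer("What is an NPU?", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))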

More from this channel

As we adapt to the rapidly growing demand for AI applications, significant challenges are bound to arise. Streamlining the process of machine learning,
complying with regulatory frameworks worldwide, ensuring security and privacy, and controlling cloud computing costs have become increasingly crucial
priorities. Your work is instrumental in overcoming these challenges. The tools you use matter, and the foundation you build on matters even more. With the
Intel® Tiber™ portfolio, we’re partnering with our customers to harness the power of AI and other cutting-edge technologies to move the world forward, e…