On-Demand
Beyond 'Thinking' Parallel: How to Choose the Right Parallel Programming Model
Learn how to pick the right parallel programming method, and why a combination is likely to be the most effective approach. Many parallelism models have been proposed, but only a few will succeed. In addition to discussing how to identify the winners, James Reinders proposes three classes of parallel programming abstractions and explains why it is perfectly reasonable to use all three in the same program. A 30-minute overview will provide insight into and options for tackling parallelism through these models, followed by a 20-minute Q&A.
Speaker: James Reinders, Director and Chief Evangelist, Intel® Software Development Products
James Reinders is leading the development efforts at Intel for performance-optimizing software tools, including compilers, libraries, and threading analysis products. He joined Intel Corporation in 1989 as a senior engineer and since that time has contributed to a range of projects, including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems.
In addition to serving as a monthly columnist for "The Gauntlet" on GoParallel.com, Reinders has authored two books, VTune™ Performance Analyzer Essentials and the recent O'Reilly nutshell title, Intel® Threading Building Blocks (TBBs).
On-Demand
Real World Parallelism: Refactoring Legacy Code and Implementing Concurrency - presented by Cadence Design Systems
Cadence Allegro's complex Design Rules Checking (DRC) process is used to verify that designs meet constraint requirements. Development is currently underway to improve the performance of the DRC process using multithreading. View the design architecture and learn about the challenges faced in refactoring the legacy code, achieving platform independence, and performance verification.
Speaker: John Schiavone, Senior Member of Consulting Staff, Cadence Design Systems
John Schiavone is Senior Member of Consulting Staff at Cadence Design Systems, where he has worked for eight years in Allegro® R&D. His Allegro experience includes mechanical interfaces (DXF, IDF, Stream), reports, GUI design, design partitioning, and physical and spacing constraints. Currently, he is introducing multithreading into the Allegro Design Rules Checking process.
Schiavone has 20 years of electronics design experience with tenures at Raytheon, Loral, and Lockheed Martin. His design experience includes circuit-board design, packaging design, and microelectronics design. His manufacturing experience comprises all aspects of electronics assembly.
On-Demand
Visual Effects for Animation - presented by DreamWorks Animation
In this webinar, Ron Henderson will show examples of visual effects, from hair and feathers to smoke and fire, from a variety of DreamWorks Animation feature films. He will discuss in general terms the kinds of techniques used to achieve particular visual effects. Finally, Henderson will show a detailed breakdown of the dam-breaking scene from Madagascar: Escape 2 Africa, demonstrating how different elements of key frame animation, simulation, and rendering are combined in a real production shot.
Speaker: Ron Henderson, DreamWorks Animation
Ron Henderson manages the FX Tools group at DreamWorks Animation, where he is responsible for developing physical simulation and procedural modeling tools. These systems have been used for key visual effects in recent films such as Kung Fu Panda and Monsters vs. Aliens (March 2009).
Prior to joining DreamWorks in 2002 he was a senior scientist at Caltech with a joint appointment to the Applied Math and Aeronautics departments, where he worked on efficient techniques for the direct numerical simulation of fluid turbulence.
Dec. 1
A Quick and Easy Way to Parallelize a Legacy Codebase with Intel® Threading Building Blocks (TBB) - presented by Avid
Learn how to overcome the limitations of a thread-based scheduler, including the absence of recursive parallelism support and the inefficient handling of unbalanced processing load. Bernard Laberge addresses how Avid avoided an expensive refactoring of their thread-based scheduler into a task-based solution by choosing Intel® Threading Building Blocks (TBB). He explores how Avid was able to easily integrate Intel TBB into their video editor applications and more than 5 million lines of code.
Speaker: Bernard Laberge, Senior Principal Engineer, Avid
Bernard Laberge is a senior principal engineer in the video editors division at Avid. During his seven years with the company he has been actively involved in the replacement of the legacy video processing engines used by Avid editors with a common hardware-abstracted, component-based video processing engine currently running on the CPU with SIMD optimized code, GPU, and dedicated hardware.
Dec. 15
How to Use Intel® Parallel Studio to Streamline Code Development in a Multicore Environment - presented by SIMULIA
Resolve elusive, costly multithreading errors quickly and efficiently with Intel® Parallel Studio. While many coding problems that lead to bugs in software applications are typically straightforward logic errors, errors in managing memory and in multithreading code can sometimes take weeks to months to diagnose and fix. Matt Dunbar explores how and why taking advantage of multicore processors through multithreaded code is critical for compute-intensive applications. While spotlighting his work on SIMULIA's Abaqus finite element solver, Dunbar addresses the need for multicore execution and shares his experiences using Intel Parallel Studio to streamline code development in a multicore environment.
Speaker: Matt Dunbar, Director for Performance Technology, SIMULIA
Matt Dunbar is the director for performance technology at SIMULIA. Since joining the company in 1993, he has worked on parallelization of the Abaqus suite of products, initially for shared memory architectures and more recently for distributed memory architectures. Dunbar has also been intimately involved in selecting both the hardware and software tools used in the development of the Abaqus product line.
On-Demand
The Key to Scaling Applications for Multicore
Whether an application is serial, partially parallel, or fully parallel, it can benefit significantly from parallelism. New Intel® Parallel Studio tools provide Windows* developers with the keys to get the most out of parallelism. Gain an in-depth understanding of when, where, and how much to use parallelism to achieve optimal results. Microsoft* Visual Studio C/C++ developers will learn how to identify and safely design applications that can scale with increasing processor core counts. Recommended companion technical webinar: Identify and Address Threading Opportunities.
Speakers: Paul Petersen, Senior Principal Engineer, Intel Corporation and Mark Davis, Senior Principal Engineer, Performance, Analysis, and Threading Lab, Intel® Software Development Products
Paul is based in Champaign, IL, having joined Intel through the acquisition of Kuck and Associates, Inc (KAI) in 2000. He has been active in the field of languages and tools for parallel computing. Paul was a major contributor to the creation of the OpenMP* specification (www.openmp.org), along with tools for the analysis of threaded applications including Assure, Assure for Java, Intel Thread Checker, and most recently the Intel Parallel Studio. He earned a B.S. in computer science from the University of Nebraska and a Master of Science and Ph.D. in computer science from the University of Illinois, Urbana-Champaign.
Mark has worked as an architect, in PAT and in the Emerging Product Lab, on tools to help users discover parallelism in their programs. Previously he held various positions in the Itanium Compiler Lab, including co-manager, architect, and co-manager of the Code Generator team. Mark specialized in compiler optimizations, performance analysis, parallelization, and architecture design in his earlier career at Digital and Compaq, Stardent, and Intermetrics. He holds a Ph.D. in Computer Science from Harvard.
On-Demand
Image Processing: Stop Developing Code from Scratch


There is a better way to develop code for images than writing it from scratch. Now, using new Intel® Parallel Studio products, developers can efficiently transform image processing for improved productivity and performance. Integrated with Microsoft Visual Studio* for C/C++, Intel® Parallel Composer, Intel® Parallel Amplifier, and Intel® Parallel Inspector enable developers to implement and optimize images with parallelism. Parallel development techniques, such as harmonization or Sobel filters in Intel® Integrated Performance Primitives (IPP), and OpenMP* at the primitive function level, will be used to demonstrate how to enhance image processing for multicore. Starting at a high level with a non-threaded application, Parallel Amplifier will locate hotspots within the application. As threads are added at a higher level with OpenMP, Parallel Inspector quickly finds and fixes threading errors. Implementing parallelism using Parallel Studio provides forward-scaling, saving developers from rewriting code with each new processor innovation.
Speaker: Walt Shands, Technical Consulting Engineer, Intel® Software Developer Products
Walt works at Intel on software developer tools. He has bachelor's and master's degrees in computer science from Columbia University.
On-Demand
Go-Parallelism! Ease the Onramp for C/C++ Windows* Development
Leveraging 25 years creating effective tools for parallel application development, Intel now brings parallelism to the full range of Windows applications. Intel Parallel Studio gives Microsoft Visual Studio* C/C++ developers the tools they need to discover, build, debug, and optimize for multicore. Hear James Reinders, Intel's chief software evangelist, introduce the newest line of development tools for the end-to-end development cycle. There is no better time to tackle parallelism or better tools to assist in the development process. Recommended companion technical webinar: Solve Parallelism with Intel Parallel Studio.
Speaker: James Reinders, Director and Chief Evangelist, Intel® Software Development Products
Reinders is leading the development efforts for performance-optimizing software tools including compilers, libraries, and threading analysis products. In 1989, Reinders joined Intel Corporation as a senior engineer and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems. Reinders is the author of the latest O'Reilly Nutshell book "Intel® Threading Building Blocks," a monthly columnist for "The Gauntlet," found online at go-parallel.com, and the author of the book "VTune™ Performance Analyzer Essentials."
On-Demand
Simplify Parallelism with Intel® Parallel Composer
Parallelism expert Joe Wolf guides developers through a comprehensive tour of the new Intel® Parallel Composer. One of the advanced products in Intel® Parallel Studio, Parallel Composer will be used to showcase how OpenMP* 3.0 and Intel® C++ language extensions work in parallel applications. See how Lambda functions, threaded libraries, Intel® Threading Building Blocks, and Valarray enabled with Intel® Integrated Performance Primitives provide effective tools to introduce, enhance, and optimize threaded applications for multicore. Whether new to parallelism or an expert, every C/C++ Microsoft* Visual Studio developer can benefit from this tour. Recommended companion technical webinars: Parallel Implementation Methods with Intel Parallel Composer, and Simplifying Parallel Implementation with Intel Threading Building Blocks.
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe Wolf has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years as a developer, in technical support and training for Intel's software tools, and now as a manager of the Intel® Compilers Technical Consulting and Support Team. He has specialized in vectorizing and parallelizing compilers, as well as multithreading.
On-Demand
Debugging Parallel Code for Fast, Reliable Applications
Memory errors, data races, and deadlock are notorious yet critical issues to track down in threaded apps. Learn new techniques using Intel® Parallel Studio developer tools and save hours of debugging time, while improving application reliability. Intel® Parallel Inspector offers unique threading analysis techniques, drilling down to source code lines where problems can occur, and enabling developers to locate and isolate common threading problems. Learn how to use Parallel Inspector to find memory leaks and common memory overruns. Tap into debugging extension plug-ins and use error checking capabilities found in Parallel Studio to improve application reliability and performance. Recommended companion technical webinars: Find Errors in Windows C++ Applications, and Static Analysis and Intel® C++ Compilers.
Speaker: Jay DeSouza, Intel® Software Development Products
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
Easy Ways to Solve Parallel Performance Challenges
New innovations bring new challenges. For many C/C++ developers, introducing parallelism means spending hours tuning an application for multicore performance. Learn techniques with a new performance tuning profiler found in Intel® Parallel Studio and quickly identify performance issues. Using application source code, Intel parallelism expert Gary Carleton demonstrates how developers can quickly solve the three most common performance issues: (1) bottlenecks, (2) locks and waits, and (3) amount and locations of threads. Windows* developers now have a tool that brings new levels of transparency for quickly and accurately tuning threaded applications for optimal performance. Recommended companion technical webinar: The Good, the Bad, and the Ugly: Improve Parallel Application Quality and Performance.
Speaker: Gary Carleton, Senior Staff Software Engineer, Performance Analysis and Threading Lab, Intel® Corporation
Gary Carleton is a Senior Staff Software Engineer in the Performance Analysis Tools group at Intel Corp. He has been at Intel for 22 years and currently works on SW performance tools including the VTune™ Performance Analyzer. He has been an engineering manager and software engineer for Intel Corp, Cadre Technologies and Kaiser Engineers. He has a BS in Electrical Engineering and Computer Sciences from the University of California at Berkeley.
On-Demand
How Fast Is Fast: Measuring and Understanding Parallel Performance
In this seminar, Intel will show examples of critical parallelism development functions including accessing various system timers; the performance of various locks in Intel® Threading Building Blocks (TBB) and on various hardware architectures; the pointlessness of optimizing applications that are not on a critical path; and measurement mistakes such as unsynchronized clocks, adjusting for GHz differences, and Amdahl limitations on speedup.
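The Amdahl limitation on speedup mentioned above can be written out as a short worked formula. This is the standard textbook statement of Amdahl's law, not material taken from the seminar; the symbols p (the fraction of run time that can be parallelized) and n (the number of cores) are chosen here for illustration.

```latex
S(n) = \frac{1}{(1 - p) + \frac{p}{n}},
\qquad
\lim_{n \to \infty} S(n) = \frac{1}{1 - p}
% Worked example: p = 0.9 and n = 8 give
% S(8) = 1 / (0.1 + 0.1125) \approx 4.7,
% and no core count can push the speedup past 1 / 0.1 = 10.
```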
Speakers: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory; Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
A Practical Threading Methodology
While threading can be a challenge, new software development tools help simplify the process by identifying thread correctness issues and performance opportunities. We will present a case study on threading a Black Scholes application—a common application in the financial services industry—to improve application performance and responsiveness. We will then look at a methodology used to successfully thread applications and valuable tools that support this threading methodology.
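As a rough illustration of why a Black-Scholes workload threads well, the sketch below prices each option as an independent task. It is not the code from the case study: the function names, the tuple layout, and the sample portfolio are assumptions made for this example, and it uses the standard closed-form price of a European call. In CPython, threads will not speed up this CPU-bound math; the point is only the decomposition, which maps directly onto a parallel loop in C or C++.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_price(option):
    """Black-Scholes price of a European call.

    option is a (spot, strike, rate, volatility, years) tuple
    (a layout assumed for this sketch).
    """
    spot, strike, rate, vol, years = option
    sqrt_t = math.sqrt(years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)


def price_all(options, workers=4):
    """Price every option; no task reads or writes another task's data."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so output matches the portfolio.
        return list(pool.map(call_price, options))


# Hypothetical two-option portfolio, for illustration only.
portfolio = [
    (100.0, 100.0, 0.05, 0.2, 1.0),
    (100.0, 110.0, 0.05, 0.2, 1.0),
]
prices = price_all(portfolio)
```

Because the tasks share no state, there is nothing to lock, which is what makes this kind of workload a common first candidate for threading.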
Speakers: John O'Neill, Engineer, Intel Developer Products Division; Shwetha Doss, Technical Consulting Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
O'Neill currently works on the Intel® C++ Compiler. He has been with Intel for eight years as a compiler technical consultant, presented talks at numerous conferences, and written many articles and white papers. O'Neill works closely with companies in financial services and digital media, and enterprise ISVs, helping them take advantage of Intel® software and hardware technologies. Before joining Intel, O'Neill was a research associate at the University of Minnesota, conducting research in elementary particle physics and developing large scale scientific applications. O'Neill has published articles in refereed journals and books, and holds a Ph.D. in physics from the University at Albany, State University of New York.
Doss supports the Intel® Threading Analysis Tools and consults with strategic customers. She has also written papers and presented talks about the tools.
On-Demand
Fundamental Topics in the Design of Parallel Programs
This seminar introduces several topics all developers of parallel code should know and be familiar with, such as parallel algorithm design, data and functional decomposition, various synchronization primitives, standard scheduling mechanisms, work stealing, cache coherence, and false sharing.
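Of the topics listed above, data decomposition is the easiest to show in a few lines. The sketch below is a minimal illustration, not seminar material: the function names and the sum-of-squares workload are assumptions chosen for the example. The input is split into contiguous chunks, the same function runs on each chunk, and the partial results are combined at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """Work done by one task on its own chunk; it touches no shared state."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    """Data decomposition: map one function over the chunks, then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(sum_of_squares, split(data, workers))
        return sum(partials)


total = parallel_sum_of_squares(list(range(10)))
```

Each task returns its own partial sum instead of updating a shared accumulator, so no synchronization primitive is needed until the final reduction.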
Speaker: Levent Akyil, Software Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Akyil has been with Intel for seven years, and held positions in the Digital Enterprise and Software and Solutions Groups. He worked on various Intel® Itanium®-based platforms and enterprise MP server projects before moving to his current role, where he provides technical consulting support for Intel® Software Developer Products. Akyil works with internal and external strategic customers providing enabling and optimization support on Intel® platforms. He has an M.S. in computer science and an MBA in technology and innovation management.
On-Demand
Optimizing Parallel Programs: Symptoms to Solutions
A systematic approach to symptom analysis can help solve multi-core performance and correctness issues. We will cover a set of symptoms and their possible causes. These include slowdown despite adding cores or increasing the number of threads, slowdown when giving threads larger pieces of work, slowdown due to scheduling issues, and slowdown or errors due to synchronization.
Speaker: Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
Boosting Performance of Imaging Solutions by Adopting New Deferred Mode Image Processing (DMIP) Layer
A typical image processing task handles data type conversion, filtering, or thresholding, one after another, and applies Intel® Integrated Performance Primitives (Intel® IPP) libraries without considering the order of calculations. With growing demand for handling larger images in complex image processing tasks, improvements to pipelined operations on images become more important. These improvements support better memory utilization and much faster performance in multi-threaded environments. We'll introduce a new Deferred Mode Image Processing (DMIP) layer built on top of Intel IPP. We'll also share the latest performance benchmark comparisons for typical image processing tasks used in imaging solutions in medical and multimedia applications.
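The difference between running each operation over the whole image and deferring the operations into a pipeline can be sketched as follows. This is not the DMIP API: the class, the stage functions, and the tiny test image are all hypothetical, and the sketch assumes row-wise stages so that each row can be processed on its own. The eager version builds a full-size intermediate image per stage; the deferred version records the stages and pushes one row at a time through the whole chain, so intermediates stay row-sized.

```python
def to_float(row):
    """Data type conversion stage."""
    return [float(v) for v in row]


def box_blur_1d(row):
    """3-tap horizontal mean filter with edge clamping."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
            for i in range(n)]


def threshold(level):
    """Build a thresholding stage that maps values to 0 or 1."""
    def apply(row):
        return [1 if v >= level else 0 for v in row]
    return apply


def run_eager(image, stages):
    """Each stage sweeps the whole image and builds a full intermediate."""
    for stage in stages:
        image = [stage(row) for row in image]
    return image


class DeferredPipeline:
    """Hypothetical deferred pipeline: record stages now, execute per row later."""

    def __init__(self):
        self.stages = []

    def then(self, stage):
        self.stages.append(stage)
        return self

    def run(self, image):
        out = []
        for row in image:  # rows are independent, so they could run in parallel
            for stage in self.stages:
                row = stage(row)
            out.append(row)
        return out


image = [[0, 0, 90, 90], [90, 0, 0, 0]]
stages = [to_float, box_blur_1d, threshold(50)]
pipeline = DeferredPipeline().then(to_float).then(box_blur_1d).then(threshold(50))
result = pipeline.run(image)
```

Both versions produce the same output; the deferred form simply keeps the working set small enough to stay in cache and leaves the rows free to be handed to separate threads.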
Speaker: Ying Song, Technical Consulting Engineer for Intel® Integrated Performance Primitives, Intel Software Solutions Group
Song is responsible for consulting with applications developers on their use of Intel Integrated Performance Primitives libraries. She has worked for Intel for 11 years.
On-Demand
Future Parallelization Technologies?
Whatif.intel.com hosts Intel's latest prototype products that explore new parallelization technologies, as well as community forums to discuss these ideas with other software technologists. Find out what Whatif.intel.com has to offer.
Speaker: Ganesh Rao, Intel Developer Products Division
Ganesh has more than 15 years of experience in the areas of application tuning, benchmarking, and developer support. Ganesh has a broad array of applications experience, including computer games, enterprise applications and high performance computing environments. Currently Ganesh is focusing on training and supporting customers with development tools and benchmarks.
On-Demand
The Concurrency Revolution
Although driven by the industry-wide shift to multi-core hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics, and other technologies. Sutter summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade.
Speaker: Herb Sutter, Software Architect, Microsoft
Sutter is a chair of the ISO C++ standards committee. Among his books and papers is the widely-cited article "The Free Lunch Is Over" in which he coined the phrase "concurrency revolution" to describe the software sea change now in progress to exploit increasingly parallel hardware.
On-Demand
Steps to Parallelism NOW
Practical advice on how to put parallelism in your programs today. Reinders will discuss how to pick the best approach and avoid common pitfalls, and will share his favorite rules of thumb on how to succeed with parallel programming. Reinders will set the context for the rest of the series, and explain how you can approach parallelism now in your applications.
Speaker: James Reinders, Chief Evangelist of Intel Software Products
Reinders is leading the development efforts for performance-optimizing software tools including compilers, libraries, and threading analysis products. In 1989, Reinders joined Intel Corporation as a senior engineer and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems. Reinders is the author of the latest O'Reilly Nutshell book "Intel® Threading Building Blocks," a monthly columnist for "The Gauntlet," found online at go-parallel.com, and the author of the book "VTune™ Performance Analyzer Essentials."
On-Demand
Threading for Performance
Doss will discuss some common performance issues specific to multithreaded applications and how these can be identified with analysis tools such as the VTune Performance Analyzer and Intel Thread Profiler and addressed with Intel® Threading Building Blocks (TBB). Intel TBB, a C++ template-based runtime library, consists of inherently scalable parallel algorithms and data structures that scale applications automatically as more processors and cores are detected in the underlying hardware. Intel TBB also helps address and fix performance issues that result from load imbalance and memory allocation.
Speaker: Shwetha Doss, Technical Consulting Engineer at the Intel Performance Analysis and Threading Lab
Doss' current role involves supporting the Intel® Threading Analysis Tools and consulting with strategic customers. She has also written papers and given talks about Intel® Threading Analysis tools.
On-Demand
Parallelism Programming Has Gone Mainstream: Are You Ready?

Tim Mattson will begin by discussing the trends driving parallel computing into the mainstream, with multi-core processors now and processors composed of many heterogeneous cores in the future. He will show how Intel is uniquely positioned to help software developers make the transition from sequential to parallel software. Parallelizing software can be complex, and for some this will be a difficult transition. For those who are well prepared, however, parallel computing is an opportunity to get a jump on the competition. Tim will close with a quick look at algorithms used in parallel computing and some domains where it has proven particularly valuable. James Reinders will then move from theory to practice. He will discuss the methodologies, techniques, and tools Intel has in place to help software developers transition to parallel computing. Finally, James will introduce our spring webinar series, where Intel's world-class programmers will do a deep dive on different facets of parallel computing and show what's new in the world of Intel's developer and threading tools.
Speakers: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory; James Reinders, Chief Evangelist of Intel Software Products
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
James Reinders is a senior engineer who joined Intel Corporation in 1989 and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers and architecture work for the iWarp, Pentium® Pro, Pentium II, Itanium®, and Pentium® 4 processors. Reinders is currently the director of business development and marketing for Intel's Software Development Products and serves as the chief evangelist and spokesperson. He has been a leader in the creation of Intel's Software Products including product plans, support, technical marketing, marketing and business development. Reinders is also the editorial columnist for the monthly "The Gauntlet" at www.devX.go-parallel.com as well as the author of the Intel Press book titled "VTune™ Performance Analyzer Essentials" and contributor to the new book "Multi-Core Programming."
On-Demand
A Gentle Introduction to Parallel Software
Dr. Tim Mattson, Principal Engineer at Intel's Microprocessor Technology Labs, will lead a webinar focused on actual code and the parallel programming APIs available to software developers. Tim will begin with an overview of the high level issues that apply to the task of creating a parallel program and then move on to consider the most commonly used parallel algorithms. He will then discuss the major parallel programming APIs (OpenMP*, MPI, and Windows* threads) showing how they are used with different algorithms and different platforms. After attending this webinar, developers should be conversant with major concurrent APIs and algorithms and be well positioned to start incorporating these techniques in their applications.
Speaker: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
On-Demand
Software Performance Analysis for Multi-Core CPUs and Windows Vista*
New operating systems continue to be introduced, while CPU cores multiply at a dizzying pace. This webinar will describe some of the special considerations presented by performance optimizations in multiple core environments on Microsoft's new Vista* operating system. We will demonstrate performance analysis using both low intrusive interrupt technology as well as instrumentation based techniques. Our primary tool in this will be the Intel® VTune™ Analyzer. Along the way we will discuss and show lesser known VTune™ Analyzer features focused on these environments. In particular, we will talk about our latest research into selecting the best CPU performance events to use with the VTune™ Analyzer's Event Based Sampling feature in identifying exactly what operations are slowing down the processor.
Speaker: Gary Carleton
Gary Carleton is a Senior Staff Software Engineer in the Performance Analysis Tools group at Intel Corp. He has been at Intel for 22 years and currently works on SW performance tools including the VTune™ Performance Analyzer. He has been an engineering manager and software engineer for Intel Corp, Cadre Technologies and Kaiser Engineers. He has a BS in Electrical Engineering and Computer Sciences from the University of California at Berkeley.
On-Demand
Three Steps to Threading and Performance
Part 1 - Thread Correctness: Maintaining Deterministic Results in Developing, Maintaining and Tuning Threaded Software
Part 1 of Three Steps to Threading and Performance discusses recommended techniques for managing unique correctness challenges in developing, maintaining and tuning threaded software. In this webinar, Dr. David Mackay discusses the techniques and processes needed to manage this effectively across groups. He presents a quick overview of Intel® Thread Checker, a product that finds challenging data races and deadlocks, and also has advanced features for development, debugging, tuning and maintenance. Attendees at this webinar will receive a grounding in the practices and techniques for developing, maintaining and tuning high quality threaded software.
Speaker: Dr. David Mackay
Dr. David Mackay is the technical lead for the Performance Analysis and Threading Tools consulting engineers team. David has been working with Intel® Threading Tools since joining the software products division in 2001. David joined Intel in 1992 as part of the Supercomputer Systems Division. He has been working on software optimization since then. David Mackay received his Ph.D. from Stanford University.
On-Demand
Three Steps to Threading and Performance
Part 2 - Expressing Parallelism: Case Studies with Intel® Threading Building Blocks

Part 2 of Three Steps to Threading and Performance examines major threading methodologies and paradigms. Victoria Gromova will discuss Intel® Threading Building Blocks, a C++ template-based runtime library. Building on the strengths of familiar programming tools such as the Standard Template Library, Intel® Threading Building Blocks provides generic parallel containers, idioms and paradigms to express parallelism without managing the threads yourself. Victoria will demonstrate the strength of generic-based coding for concurrency, performance and scalability through its application to a complex games physics engine library. She will briefly contrast OpenMP*, OS threads and Intel® Threading Building Blocks tasks. Attendees at this webcast will understand why, when and how to begin using the Intel® Threading Building Blocks to introduce concurrency into their serial code.
Speaker: Victoria Gromova
Victoria Gromova works as a Senior Software Engineer at Intel Corporation. She contributed to design and development of tools for multi-threaded application performance analysis, such as Intel® Thread Profiler. Victoria joined the Performance Analysis and Threading Tools consulting team in early 2006 and has worked on many threading projects, including threading open source libraries for simulating rigid body dynamics.
On-Demand
Three Steps to Threading and Performance
Part 3 - Tuning Threaded Software: Next Steps After Concurrency

Part 3 of Three Steps to Threading and Performance addresses optimization issues faced by developers after getting their threaded applications to run correctly and to produce deterministic results. Vasanth Tovinkere suggests two relatively simple but crucial steps for creating optimized threaded code. The first step is an inspection of the software architecture and thread interactions. This is key to overcoming the performance degradation often encountered in newly threaded applications. Next, we discuss how developers can further analyze the behavior of the application on a given platform and understand the runtime implications on the platform architecture. Our primary tools for both are the VTune™ Analyzer and the Intel® Thread Profiler. After attending this webinar, developers should be able to understand why threaded code sometimes has performance issues, how to isolate the problem as well as how to optimize the code using VTune™ and Intel® Thread Profiler.
Speaker: Vasanth Tovinkere
Vasanth Tovinkere is a Senior Staff Engineer at the Intel Performance, Analysis and Threading Lab in the Developer Products Division (DPD). His current role involves supporting the Intel® Threading Tools and consulting with strategic customers through the Threading Immersion Program. He has also been involved in the development of automatic semantic event detectors for digital sports technologies in Intel Labs. His research interests include data mining of performance and trace data and fuzzy inference engines. Vasanth began his career at Intel in 1997 as an engineer, researching threading behavior and performance and working with early adopters on Wall Street to enable them for multi-processor architectures. Prior to joining Intel, he was involved in the development of automated fuzzy pattern recognition algorithms for NASA's Mission to Planet Earth Program.
On-Demand
Using Intel® C++ and Fortran Compilers, Version 10.0 for Performance, Multithreading, and Security
Learning how to introduce parallelism and optimizations within serial code is not easy. In this installment of Intel's multi-core webinar series, you will learn how to take advantage of Intel's new parallel optimization technology to allow you to vectorize the inner loops and parallelize outer loops without conflicts for both auto-parallelization and OpenMP*. We will also cover how to find some common, but annoying, security issues and coding errors in C & C++ with our new verification features.
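As a rough illustration of the loop structure this abstract describes, the sketch below parallelizes an outer loop with OpenMP* and leaves a simple, contiguous inner loop as a candidate for the compiler's vectorizer. It is not taken from the webinar: the function name `scale_matrix` and the row-major layout are assumptions made for the example, and whether the inner loop is actually vectorized depends on the compiler and its options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the webinar): multiplies every element of a
// row-major rows x cols matrix by `factor`, in place.
void scale_matrix(std::vector<double>& data, std::size_t rows,
                  std::size_t cols, double factor) {
    assert(data.size() == rows * cols);

    // Outer loop: each row is independent of the others, so the rows can be
    // divided among threads. Without OpenMP enabled, the pragma is ignored
    // and the loop simply runs serially.
    #pragma omp parallel for
    for (long r = 0; r < static_cast<long>(rows); ++r) {
        double* row = data.data() + static_cast<std::size_t>(r) * cols;

        // Inner loop: contiguous accesses with no cross-iteration dependence,
        // which is the kind of loop an auto-vectorizer can usually handle.
        for (std::size_t c = 0; c < cols; ++c) {
            row[c] *= factor;
        }
    }
}
```

Keeping the threading on the outer loop and the vectorizable work on the inner loop means the two forms of parallelism do not compete for the same iterations.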
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe Wolf has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years as a developer, in technical support and training for Intel's software tools, and now as a manager of the Intel® Compilers Technical Consulting and Support Team. He has specialized in vectorizing and parallelizing compilers, as well as multithreading.
On-Demand
The Good, the Bad, and the Ugly: Improve Parallel Application Quality and Performance
To get more performance from today's multicore processors, software must run in parallel. Explore the most common design patterns and anti-patterns in creating error-free, efficient multicore software. Find out how to create good parallel software, and eliminate the bad and the ugly, using Intel® software development tools and examples based on the movie. Recommended companion technical webinar: Easy Ways to Solve Parallel Performance Challenges.
Speaker: Eric Moore, Senior Software Engineer, Performance, Analysis, and Threading Group, Intel® Software Development Products
Eric has worked at Rational, Microsoft, RealNetworks, Digital, Compaq, and Keane. His specialties include performance tuning, threading, compilers, CPU architecture, operating systems, and high performance computing. In the past 9 years, Moore has trained and consulted with more than 1000 engineers in performance optimization, in fields including healthcare, games, oil, manufacturing, labs, universities, enterprise, security, and the military, and in regions including North America, South America, Asia, and Europe.
On-Demand
Identify and Address Threading Opportunities
Parallelize client applications using Intel® Parallel Studio. Identify where to parallelize code and how to go about making the changes. This demonstration covers key tool capabilities: identifying hot spots that would benefit from threading, using speculative evaluation to find threading barriers, determining whether barriers are really limiting or can be overcome, and overcoming threading barriers by adding locks or restructuring code. Effective techniques combined with compelling examples using OpenMP* and Intel® Threading Building Blocks will help developers apply insights to applications and take advantage of multicore hardware for better performance. Recommended companion technical webinar: The Key to Scaling Applications for Multicore.
Speaker: Caroline Davidson, Staff Software Engineer, Intel® Software Development Products
Caroline joined Intel through the acquisition of Compaq Corporation's Visual Fortran team in 2001. She has over 11 years of compiler code generation experience as a member of Digital Equipment's GEM Compiler System team and over 10 years of experience integrating with Microsoft Visual Studio, Borland, and Apple Xcode IDEs. Her roles have included new product R&D, program management, and third party liaison. Caroline earned her B.S. in computer science from SUNY, Stony Brook.
On-Demand
Simplifying Parallelism Implementation with Intel® Threading Building Blocks
Use the Intel® Threading Building Blocks (Intel® TBB) template library to introduce parallelism into applications. The use of lambda expressions, available in Intel® Parallel Composer, is discussed, along with data parallel and task parallel models of parallel programming. Specific focus is placed on representing common parallel programming patterns, such as pipelines and concurrent queues, using Intel TBB templates. The newest enhancements to the Intel TBB library are also explored, including task-to-thread affinity and task cancellation support.
Speaker: Michael D'Mello, Senior Technical Consulting Engineer, Intel® Software Development Products
For the last 20 years Mike's main area of focus has been models of parallel computation and parallel algorithms. He has worked in the area of parallel computing for Thinking Machines Corporation, Convex Computer Corporation, The Hewlett-Packard Company, and Intel Corporation. He holds a Ph.D. in the field of quantum dynamics from the University of Texas, Austin. Mike has been with Intel Corporation since 2003.
On-Demand
Static Analysis and Intel® C++ Compilers
Static analysis helps find application issues, such as runtime error conditions, resource leaks, and security issues. Explore the static analysis capabilities of Intel® Parallel Composer's C++ Compiler and Source Checker tool. Source Checker acts as "parallel lint" to provide source file diagnostics that help eliminate bugs, boundary violations, and memory corruption. It builds on the compiler’s interprocedural analysis capability to provide whole-program error detection including routine mismatches, variable misuse, OpenMP directive errors, and more.
Speaker: Dmitry Putunin
On-Demand
Solve Parallelism with Intel® Parallel Studio
Get a first-hand technical walk-through of the parallel programming tools in the new Intel® Parallel Studio. This demonstration addresses parallelism opportunities in the classic NQueens problem, which many developers are familiar with from their computer science or engineering training. The NQueens solutions used in this webinar ship with Parallel Studio, so developers who download and install Parallel Studio can follow along on their own systems.
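For readers who have not seen the problem recently, the sketch below shows one place where a parallelism opportunity arises in NQueens: each choice of column for the queen in the first row starts an independent search, so those searches can run on separate threads. This is a minimal illustration written for this listing, not the code from Parallel Studio; the names `count_nqueens`, `place_from`, and `is_safe` are hypothetical, and the OpenMP* pragma is one possible way to express the parallel loop.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// cols[r] holds the column of the queen already placed in row r.
// Returns true if a queen at (row, col) is not attacked by any earlier row.
static bool is_safe(const std::vector<int>& cols, int row, int col) {
    for (int r = 0; r < row; ++r) {
        if (cols[r] == col || std::abs(cols[r] - col) == row - r) {
            return false;  // same column or same diagonal
        }
    }
    return true;
}

// Counts the ways to complete the board from `row` downward (serial search).
static long place_from(std::vector<int>& cols, int row, int n) {
    if (row == n) {
        return 1;  // every row has a queen: one complete solution
    }
    long count = 0;
    for (int col = 0; col < n; ++col) {
        if (is_safe(cols, row, col)) {
            cols[row] = col;
            count += place_from(cols, row + 1, n);
        }
    }
    return count;
}

// Hypothetical entry point: counts all solutions for an n x n board.
long count_nqueens(int n) {
    assert(n >= 1);
    long total = 0;

    // Each first-row column starts an independent subtree of the search.
    // Every iteration uses its own board, and the per-thread counts are
    // combined by the reduction. Without OpenMP the loop runs serially.
    #pragma omp parallel for reduction(+ : total)
    for (int first = 0; first < n; ++first) {
        std::vector<int> cols(n, 0);
        cols[0] = first;
        total += place_from(cols, 1, n);
    }
    return total;
}
```

Giving each iteration a private board avoids sharing mutable state between threads, so the only shared value is the running total handled by the reduction.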
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years in software development, technical support, and training for Intel software tools. He specializes in vectorizing and parallelizing compilers.
On-Demand
Parallel Implementation Methods with Intel® Parallel Composer
In-depth coverage of C/C++ parallelization methods for Microsoft Visual Studio* C++ developers. Find out how to use Intel® Parallel Studio to apply parallelism methods such as OpenMP* 3.0, Intel® C++ language extensions for parallelism, Lambda functions, threaded libraries, Intel® Threading Building Blocks, Intel® Integrated Performance Primitives, and more. This session also demonstrates debugging tools such as Intel® Parallel Debugger Extension and Intel® Parallel Inspector. Recommended companion technical webinar: Simplify Parallelism with Intel® Parallel Composer.
Speaker: Ganesh Rao, Intel® Compiler Lab, Intel® Software Development Products
Ganesh helps customers implement concurrency in applications and take advantage of optimization techniques offered by the Intel® C++ Compilers in the real world. Ganesh has more than 15 years of experience in the areas of application tuning and benchmarking. He has a broad array of applications experience including computer games, enterprise applications, and high performance computing environments. Prior to joining the Intel Compiler Lab 9 years ago, he helped with performance modeling of chipsets in the microprocessor group.
On-Demand
Find Errors in Windows C++ Parallel Applications
Address debugging issues encountered when adding parallelism to existing code or creating new parallel applications. This high-level survey examines the Intel® Parallel Debugger Extension for those familiar with Microsoft Visual Studio*, Visual C++, and the Microsoft Visual Studio debugger. It covers topics such as how to zero in on data for analysis, set up filters to control the amount of data collected, serialize parallel regions without recompilation, and use a new class of data breakpoints. The hands-on webinar shows the debugger in use, including added windows to help visualize logs. Recommended companion technical webinar: Debugging Parallel Code for Fast, Reliable Applications.
Speakers: Robert Mueller-Albrecht, Intel® Software Development Products and Bernth Andersson
After completing his MSc at the University of Kaiserslautern, Robert spent another two years in the field of physics research before joining CAD-UL in 2000. There he focused on customer support and technical consultancy for embedded systems development tools. Since 2001, he has continued in this focus at Intel in Arizona, working with development tools solutions for the cellular and handheld space. In recent years, Robert's focus has shifted towards consulting, evangelism, and requirements gathering for development tools targeting the Intel® Atom™ Processor, debug solutions for the consumer electronics space, and the world of multithreaded, highly parallel applications.