On-Demand
The Key to Scaling Applications for Multicore
Whether an application is serial, partially parallel, or fully parallel, it can benefit significantly from parallelism. New Intel® Parallel Studio tools provide Windows* developers with the keys to get the most out of parallelism. Gain an in-depth understanding of when, where, and how much to use parallelism to achieve optimal results. Microsoft* Visual Studio C/C++ developers will learn how to identify and safely design applications that can scale with increasing processor core counts. Recommended companion technical webinar: Identify and Address Threading Opportunities.
Speakers: Paul Petersen, Senior Principal Engineer, Intel Corporation and Mark Davis, Senior Principal Engineer, Performance, Analysis, and Threading Lab, Intel® Software Development Products
Paul is based in Champaign, IL, having joined Intel through the acquisition of Kuck and Associates, Inc (KAI) in 2000. He has been active in the field of languages and tools for parallel computing. Paul was a major contributor to the creation of the OpenMP* specification (www.openmp.org), along with tools for the analysis of threaded applications including Assure, Assure for Java, Intel Thread Checker, and most recently the Intel Parallel Studio. He earned a B.S. in computer science from the University of Nebraska and a Master of Science and Ph.D. in computer science from the University of Illinois, Urbana-Champaign.
Mark has worked as an architect, in PAT and in the Emerging Product Lab, on tools to help users discover parallelism in their programs. Previously he held various positions in the Itanium Compiler Lab, including co-manager, architect, and co-manager of the Code Generator team. Mark specialized in compiler optimizations, performance analysis, parallelization, and architecture design in his earlier career at Digital and Compaq, Stardent, and Intermetrics. He holds a Ph.D. in Computer Science from Harvard.
On-Demand
Image Processing: Stop Developing Code from Scratch
There is a better way to develop code for images than writing it from scratch. Now, using new Intel® Parallel Studio products, developers can efficiently transform image processing for improved productivity and performance. Integrated with Microsoft Visual Studio* for C/C++, Intel® Parallel Composer, Intel® Parallel Amplifier, and Intel® Parallel Inspector enable developers to implement and optimize images with parallelism. Parallel development techniques, such as harmonization or Sobel filters in Intel® Integrated Performance Primitives (IPP), and OpenMP* at the primitive function level, will be used to demonstrate how to enhance image processing for multicore. Starting at a high level with a non-threaded application, Parallel Amplifier will locate hotspots within the application. As threads are added at a higher level with OpenMP, Parallel Inspector quickly finds and fixes threading errors. Implementing parallelism using Parallel Studio provides forward-scaling, saving developers from rewriting code with each new processor innovation.
Speaker: Walt Shands, Technical Consulting Engineer, Intel® Software Developer Products
Walt works at Intel on software developer tools. He has bachelor's and master's degrees in computer science from Columbia University.
On-Demand
Go-Parallelism! Ease the Onramp for C/C++ Windows* Development
Leveraging 25 years creating effective tools for parallel application development, Intel now brings parallelism to the full range of Windows applications. Intel Parallel Studio gives Microsoft Visual Studio* C/C++ developers the tools they need to discover, build, debug, and optimize for multicore. Hear James Reinders, Intel's chief software evangelist, introduce the newest line of development tools for the end-to-end development cycle. There is no better time to tackle parallelism or better tools to assist in the development process. Recommended companion technical webinar: Solve Parallelism with Intel Parallel Studio.
Speaker: James Reinders, Director and Chief Evangelist, Intel® Software Development Products
Reinders is leading the development efforts for performance optimizing software tools including compilers, libraries, and threading analysis products. In 1989, Reinders joined Intel Corporation as a senior engineer and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems. Reinders is the author of the latest O'Reilly Nutshell book "Intel® Threading Building Blocks," a monthly columnist for "The Gauntlet," found online at go-parallel.com, and the author of the book "VTune™ Performance Analyzer Essentials."
On-Demand
Simplify Parallelism with Intel® Parallel Composer
Parallelism expert Joe Wolf guides developers through a comprehensive tour of the new Intel® Parallel Composer. One of the advanced products in Intel® Parallel Studio, Parallel Composer will be used to showcase how OpenMP* 3.0 and Intel® C++ language extensions work in parallel applications. See how Lambda functions, threaded libraries, Intel® Threading Building Blocks, and Valarray enabled with Intel® Integrated Performance Primitives provide effective tools to introduce, enhance, and optimize threaded applications for multicore. Whether new to parallelism or an expert, every C/C++ Microsoft* Visual Studio developer can benefit from this tour. Recommended companion technical webinars: Parallel Implementation Methods with Intel Parallel Composer, and Simplifying Parallel Implementation with Intel Threading Building Blocks.
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe Wolf has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years as a developer, in technical support and training for Intel's software tools, and now as a manager of the Intel® Compilers Technical Consulting and Support Team. He has specialized in vectorizing and parallelizing compilers, as well as multithreading.
On-Demand
Debugging Parallel Code for Fast, Reliable Applications
Memory errors, data races, and deadlock are notorious yet critical issues to track down in threaded apps. Learn new techniques using Intel® Parallel Studio developer tools and save hours of debugging time, while improving application reliability. Intel® Parallel Inspector offers unique threading analysis techniques, drilling down to source code lines where problems can occur, and enabling developers to locate and isolate common threading problems. Learn how to use Parallel Inspector to find memory leaks and common memory overruns. Tap into debugging extension plug-ins and use error checking capabilities found in Parallel Studio to improve application reliability and performance. Recommended companion technical webinars: Find Errors in Windows C++ Applications, and Static Analysis and Intel® C++ Compilers.
Speaker: Jay DeSouza, Intel® Software Development Products
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
Easy Ways to Solve Parallel Performance Challenges
New innovations bring new challenges. For many C/C++ developers, introducing parallelism means spending hours tuning an application for multicore performance. Learn techniques with a new performance tuning profiler found in Intel® Parallel Studio and quickly identify performance issues. Using application source code, Intel parallelism expert Gary Carleton demonstrates how developers can quickly solve the three most common performance issues: (1) bottlenecks, (2) locks and waits, and (3) amount and locations of threads. Windows* developers now have a tool that brings new levels of transparency for quickly and accurately tuning threaded applications for optimal performance. Recommended companion technical webinar: The Good, the Bad, and the Ugly: Improve Parallel Application Quality and Performance.
Speaker: Gary Carleton, Senior Staff Software Engineer, Performance Analysis and Threading Lab, Intel® Corporation
Gary Carleton is a Senior Staff Software Engineer in the Performance Analysis Tools group at Intel Corp. He has been at Intel for 22 years and currently works on SW performance tools including the VTune™ Performance Analyzer. He has been an engineering manager and software engineer for Intel Corp, Cadre Technologies and Kaiser Engineers. He has a BS in Electrical Engineering and Computer Sciences from the University of California at Berkeley.
On-Demand
How Fast Is Fast: Measuring and Understanding Parallel Performance
In this seminar, Intel will show examples of critical parallelism development functions including accessing various system timers; the performance of various locks in Intel® Threading Building Blocks (TBB) and on various hardware architectures; the pointlessness of optimizing applications that are not on a critical path; and measurement mistakes such as unsynchronized clocks, adjusting for GHz differences, and Amdahl limitations on speedup.
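The Amdahl limit mentioned above can be made concrete with a short calculation. The following is a minimal sketch, not material from the seminar; the function and parameter names are illustrative assumptions. It computes the best-case speedup when only part of a program's runtime can run in parallel.

```cpp
#include <cassert>

// Amdahl's law: upper bound on speedup when only a fraction of the
// runtime can be parallelized. The function and parameter names are
// illustrative, not taken from any Intel tool or library.
double amdahl_speedup(double parallel_fraction, int cores) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(cores >= 1);
    const double serial_fraction = 1.0 - parallel_fraction;
    // The serial part runs at its original speed; only the parallel
    // part is divided among the cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores);
}
```

For a program that is 90% parallel, amdahl_speedup(0.9, 8) gives roughly 4.7, and no number of cores can push the result to 10, because the remaining 10% serial portion dominates.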
Speakers: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory; Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
A Practical Threading Methodology
While threading can be a challenge, new software development tools help simplify the process by identifying thread correctness issues and performance opportunities. We will present a case study on threading a Black-Scholes application, a common application in the financial services industry, to improve application performance and responsiveness. We will then look at a methodology used to successfully thread applications and valuable tools that support this threading methodology.
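To give a sense of why option pricing is a natural threading case study, here is a minimal sketch. It is not the code used in the seminar. It assumes the standard closed-form Black-Scholes price for a European call with no dividends, and the Option struct, its field names, and the function names are all hypothetical. Each option is priced independently, so the loop over options can be divided among threads with a single OpenMP directive.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical input record; the field names are illustrative.
struct Option {
    double spot;        // current price of the underlying
    double strike;      // exercise price
    double rate;        // risk-free rate, continuously compounded
    double volatility;  // annualized volatility
    double years;       // time to expiry in years
};

// Standard normal CDF expressed through the complementary error function.
double normal_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Closed-form Black-Scholes price of a European call (no dividends).
double call_price(const Option& o) {
    assert(o.volatility > 0.0 && o.years > 0.0);
    const double vol_sqrt_t = o.volatility * std::sqrt(o.years);
    const double d1 = (std::log(o.spot / o.strike)
                       + (o.rate + 0.5 * o.volatility * o.volatility) * o.years)
                      / vol_sqrt_t;
    const double d2 = d1 - vol_sqrt_t;
    return o.spot * normal_cdf(d1)
         - o.strike * std::exp(-o.rate * o.years) * normal_cdf(d2);
}

// Iterations share no mutable state: each one reads its own option and
// writes its own output slot, so the loop can be split across threads.
// If OpenMP is not enabled, the pragma is ignored and the loop runs serially.
std::vector<double> price_all(const std::vector<Option>& options) {
    std::vector<double> prices(options.size());
    const long n = static_cast<long>(options.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        prices[i] = call_price(options[i]);
    }
    return prices;
}
```

Because every iteration writes only to its own element of prices, no locks are needed; correctness tools are most useful once shared state, such as a running total, enters the loop.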
Speakers: John O'Neill, Engineer, Intel Developer Products Division; Shwetha Doss, Technical Consulting Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
O'Neill currently works on the Intel® C++ Compiler. He has been with Intel for eight years as a compiler technical consultant, presented talks at numerous conferences, and written many articles and white papers. O'Neill works closely with companies in financial services and digital media, and enterprise ISVs, helping them take advantage of Intel® software and hardware technologies. Before joining Intel, John was a research associate at the University of Minnesota, conducting research in elementary particle physics and developing large scale scientific applications. O'Neill has published articles in refereed journals and books, and holds a Ph.D. in physics from the University at Albany, State University of New York.
Doss supports the Intel® Threading Analysis Tools and consults with strategic customers. She has also written papers and presented talks about the tools.
On-Demand
Fundamental Topics in the Design of Parallel Programs
This seminar introduces several topics all developers of parallel code should know and be familiar with, such as parallel algorithm design, data and functional decomposition, various synchronization primitives, standard scheduling mechanisms, work stealing, cache coherence, and false sharing.
Speaker: Levent Akyil, Software Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Akyil has been with Intel for seven years, and held positions in the Digital Enterprise and Software and Solutions Groups. He worked on various Intel® Itanium®-based platforms and enterprise MP server projects before moving to his current role, where he provides technical consulting support for Intel® Software Developer Products. Akyil works with internal and external strategic customers providing enabling and optimization support on Intel® platforms. He has an M.S. in computer science and an MBA in technology and innovation management.
On-Demand
Optimizing Parallel Programs: Symptoms to Solutions
A systematic approach to symptom analysis can help solve multi-core performance and correctness issues. We will cover a set of symptoms and their possible causes. These include slowdown despite adding cores or increasing the number of threads, slowdown when giving threads larger pieces of work, slowdown due to scheduling issues, and slowdown/errors due to synchronization, etc.
Speaker: Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
On-Demand
Boosting Performance of Imaging Solutions by Adopting New Deferred Mode Image Processing (DMIP) Layer
A typical image processing task handles data type conversion, filtering, or thresholding, one after another, and applies Intel® Integrated Performance Primitives (Intel® IPP) libraries without considering the order of calculations. With demands for dealing with larger images in complex image processing tasks, additional improvements for pipelined operations on images become more important. These improvements support better memory utilization and much faster performance in multi-threaded environments. We'll introduce a new Deferred Mode Image Processing (DMIP) layer built on top of Intel IPP. We'll also share the latest performance benchmark comparisons for typical image processing tasks used in imaging solutions in medical and multimedia applications.
Speaker: Ying Song, Technical Consulting Engineer for Intel® Integrated Performance Primitives, Intel Software Solutions Group
Song is responsible for consulting with applications developers on their use of Intel Integrated Performance Primitives libraries. She has worked for Intel for 11 years.
On-Demand
Future Parallelization Technologies?
Whatif.intel.com hosts Intel's latest prototype products that explore new parallelization technologies, as well as community forums to discuss these ideas with other software technologists. Find out what Whatif.intel.com has to offer.
Speaker: Ganesh Rao, Intel Developer Products Division
Ganesh has more than 15 years of experience in the areas of application tuning, benchmarking, and developer support. Ganesh has a broad array of applications experience, including computer games, enterprise applications and high performance computing environments. Currently Ganesh is focusing on training and supporting customers with development tools and benchmarks.
On-Demand
The Concurrency Revolution
Although driven by the industry-wide shift to multi-core hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics, and other technologies. Sutter summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade.
Speaker: Herb Sutter, Software Architect, Microsoft
Sutter is a chair of the ISO C++ standards committee. Among his books and papers is the widely-cited article "The Free Lunch Is Over" in which he coined the phrase "concurrency revolution" to describe the software sea change now in progress to exploit increasingly parallel hardware.
On-Demand
Steps to Parallelism NOW
Practical advice on how to put parallelism in your programs today. Reinders will discuss how to pick the best approach and avoid common pitfalls, and will share his favorite rules of thumb on how to succeed with parallel programming. Reinders will set the context for the rest of the series, and explain how you can approach parallelism now in your applications.
Speaker: James Reinders, Chief Evangelist of Intel Software Products
Reinders is leading the development efforts for performance optimizing software tools including compilers, libraries, and threading analysis products. In 1989, Reinders joined Intel Corporation as a senior engineer and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems. Reinders is the author of the latest O'Reilly Nutshell book "Intel® Threading Building Blocks," a monthly columnist for "The Gauntlet," found online at go-parallel.com, and the author of the book "VTune™ Performance Analyzer Essentials."
On-Demand
Threading for Performance
Doss will discuss some common performance issues specific to multithreaded applications and how these can be identified with analysis tools such as the VTune Performance Analyzer and Intel Thread Profiler and addressed with Intel® Threading Building Blocks (TBB). Intel TBB, a C++ template-based runtime library, consists of inherently scalable parallel algorithms and data structures that scale applications automatically as more processors and cores are detected in the underlying hardware. Intel TBB also helps address and fix performance issues that result from load imbalance and memory allocation.
Speaker: Shwetha Doss, Technical Consulting Engineer at the Intel Performance Analysis and Threading Lab
Doss' current role involves supporting the Intel® Threading Analysis Tools and consulting with strategic customers. She has also written papers and given talks about Intel® Threading Analysis tools.
On-Demand
Parallelism Programming Has Gone Mainstream: Are You Ready?
Tim Mattson will begin by discussing the trends driving parallel computing into the mainstream, with multi-core processors now and processors composed of many heterogeneous cores in the future. He will show how Intel is uniquely positioned to help software developers make the transition from sequential to parallel software. Parallelizing software can be complex and for some, this will be a difficult transition. For those who are well prepared, however, parallel computing is an opportunity to get a jump on the competition. Tim will close with a quick look at algorithms used in parallel computing and some domains where it has proven particularly valuable. James Reinders will then move from theory to practice. He will discuss the methodologies, techniques, and tools Intel has in place to help software developers transition to parallel computing. Finally, James will introduce our spring webinar series where Intel's world-class programmers will do a deep dive on different facets of parallel computing and show what's new in the world of Intel's developer and threading tools.
Speakers: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory; James Reinders, Chief Evangelist of Intel Software Products
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
James Reinders is a senior engineer who joined Intel Corporation in 1989 and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers and architecture work for the iWarp, Pentium® Pro, Pentium II, Itanium®, and Pentium® 4 processors. Reinders is currently the director of business development and marketing for Intel's Software Development Products and serves as the chief evangelist and spokesperson. He has been a leader in the creation of Intel's Software Products including product plans, support, technical marketing, marketing and business development. Reinders is also the editorial columnist for the monthly "The Gauntlet" at www.devX.go-parallel.com as well as the author of the Intel Press book titled "VTune™ Performance Analyzer Essentials" and contributor to the new book "Multi-Core Programming."
On-Demand
A Gentle Introduction to Parallel Software
Dr. Tim Mattson, Principal Engineer at Intel's Microprocessor Technology Labs, will lead a webinar focused on actual code and the parallel programming APIs available to software developers. Tim will begin with an overview of the high level issues that apply to the task of creating a parallel program and then move on to consider the most commonly used parallel algorithms. He will then discuss the major parallel programming APIs (OpenMP*, MPI, and Windows* threads) showing how they are used with different algorithms and different platforms. After attending this webinar, developers should be conversant with major concurrent APIs and algorithms and be well positioned to start incorporating these techniques in their applications.
Speaker: Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
On-Demand
Software Performance Analysis for Multi-Core CPUs and Windows Vista*
New operating systems continue to be introduced, while CPU cores multiply at a dizzying pace. This webinar will describe some of the special considerations presented by performance optimizations in multiple core environments on Microsoft's new Vista* operating system. We will demonstrate performance analysis using both low intrusive interrupt technology as well as instrumentation based techniques. Our primary tool in this will be the Intel® VTune™ Analyzer. Along the way we will discuss and show lesser known VTune™ Analyzer features focused on these environments. In particular, we will talk about our latest research into selecting the best CPU performance events to use with the VTune™ Analyzer's Event Based Sampling feature in identifying exactly what operations are slowing down the processor.
Speaker: Gary Carleton
Gary Carleton is a Senior Staff Software Engineer in the Performance Analysis Tools group at Intel Corp. He has been at Intel for 22 years and currently works on SW performance tools including the VTune™ Performance Analyzer. He has been an engineering manager and software engineer for Intel Corp, Cadre Technologies and Kaiser Engineers. He has a BS in Electrical Engineering and Computer Sciences from the University of California at Berkeley.
On-Demand
Three Steps to Threading and Performance
Part 1 - Thread Correctness: Maintaining Deterministic Results in Developing, Maintaining and Tuning Threaded Software
Part 1 of Three Steps to Threading and Performance discusses recommended techniques for managing unique correctness challenges in developing, maintaining and tuning threaded software. In this webinar, Dr. David Mackay discusses the techniques and processes needed to manage this effectively across groups. He presents a quick overview of Intel® Thread Checker, a product that finds challenging data races and deadlocks, and also has advanced features for development, debugging, tuning and maintenance. Attendees at this webinar will receive a grounding in the practices and techniques for developing, maintaining and tuning high quality threaded software.
Speaker: Dr. David Mackay
Dr. David Mackay is the technical lead for the Performance Analysis and Threading Tools consulting engineers team. David has been working with Intel® Threading Tools since joining the software products division in 2001. David joined Intel in 1992 as part of the Supercomputer Systems Division. He has been working on software optimization since then. David Mackay received his Ph.D. from Stanford University.
On-Demand
Three Steps to Threading and Performance
Part 2 - Expressing Parallelism: Case Studies with Intel® Threading Building Blocks
Part 2 of Three Steps to Threading and Performance examines major threading methodologies and paradigms. Victoria Gromova will discuss Intel® Threading Building Blocks, a C++ template-based runtime library. Building on the strengths of familiar programming tools such as the Standard Template Library, Intel® Threading Building Blocks provides generic parallel containers, idioms and paradigms to express parallelism without managing the threads yourself. Victoria will demonstrate the strength of generic-based coding for concurrency, performance and scalability through its application to a complex games physics engine library. She will briefly contrast OpenMP*, OS threads and Intel® Threading Building Blocks tasks. Attendees at this webcast will understand why, when and how to begin using the Intel® Threading Building Blocks to introduce concurrency into their serial code.
Speaker: Victoria Gromova
Victoria Gromova works as a Senior Software Engineer at Intel Corporation. She contributed to design and development of tools for multi-threaded application performance analysis, such as Intel® Thread Profiler. Victoria joined the Performance Analysis and Threading Tools consulting team in early 2006 and has worked on many threading projects, including threading open source libraries for simulating rigid body dynamics.
On-Demand
Three Steps to Threading and Performance
Part 3 - Tuning Threaded Software: Next Steps After Concurrency
Part 3 of Three Steps to Threading and Performance addresses optimization issues faced by developers after getting their threaded applications to run correctly and to produce deterministic results. Vasanth Tovinkere suggests two relatively simple but crucial steps for creating optimized threaded code. The first step is an inspection of the software architecture and thread interactions. This is key to overcoming the performance degradation often encountered in newly threaded applications. Next, we discuss how developers can further analyze the behavior of the application on a given platform and understand the runtime implications on the platform architecture. Our primary tools for both are the VTune™ Analyzer and the Intel® Thread Profiler. After attending this webinar, developers should be able to understand why threaded code sometimes has performance issues, how to isolate the problem as well as how to optimize the code using VTune™ and Intel® Thread Profiler.
Speaker: Vasanth Tovinkere
Vasanth Tovinkere is a Senior Staff Engineer at the Intel Performance, Analysis and Threading Lab in the Developer Products Division (DPD). His current role involves supporting the Intel® Threading Tools and consulting with strategic customers through the Threading Immersion Program. He has also been involved in the development of automatic semantic event detectors for digital sports technologies in Intel Labs. His research interests include data mining of performance and trace data and fuzzy inference engines. Vasanth began his career at Intel in 1997 as an engineer where he researched threading behavior and performance and worked with early adopters in Wall Street to enable them for multi-processor architectures. Prior to joining Intel, he was involved in the development of automated fuzzy pattern recognition algorithms for NASA's Mission to Planet Earth Program.
On-Demand
Using Intel® C++ and Fortran Compilers, Version 10.0 for Performance, Multithreading, and Security
Learning how to introduce parallelism and optimizations within serial code is not easy. In this installment of Intel's multi-core webinar series, you will learn how to take advantage of Intel's new parallel optimization technology to allow you to vectorize the inner loops and parallelize outer loops without conflicts for both auto-parallelization and OpenMP*. We will also cover how to find some common, but annoying, security issues and coding errors in C & C++ with our new verification features.
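The loop structure described above, with a parallel outer loop and a vectorizable inner loop, can be sketched as follows. This is an illustrative example under stated assumptions, not code from the webinar: the function name and the row-major matrix layout are hypothetical, and whether the inner loop is actually vectorized depends on the compiler and its optimization settings.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Adds two row-major matrices of size rows x cols.
// The name and signature are illustrative assumptions.
std::vector<float> add_matrices(const std::vector<float>& a,
                                const std::vector<float>& b,
                                int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    assert(a.size() == static_cast<std::size_t>(rows) * cols);
    assert(b.size() == a.size());

    std::vector<float> out(a.size());
    const float* pa = a.data();
    const float* pb = b.data();
    float* po = out.data();

    // Outer loop: rows are independent, so they can be divided among
    // threads. If OpenMP is not enabled, the pragma is ignored.
    #pragma omp parallel for
    for (int r = 0; r < rows; ++r) {
        const std::size_t base = static_cast<std::size_t>(r) * cols;
        // Inner loop: unit-stride accesses with no dependence between
        // iterations, a shape that auto-vectorizers commonly handle.
        for (int c = 0; c < cols; ++c) {
            po[base + c] = pa[base + c] + pb[base + c];
        }
    }
    return out;
}
```

Keeping threads on the outer loop and SIMD on the inner loop means the two forms of parallelism work on different dimensions of the data: each thread owns whole rows, and vector instructions sweep contiguous elements within a row.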
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe Wolf has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years as a developer, in technical support and training for Intel's software tools, and now as a manager of the Intel® Compilers Technical Consulting and Support Team. He has specialized in vectorizing and parallelizing compilers, as well as multithreading.
On-Demand
The Good, the Bad, and the Ugly: Improve Parallel Application Quality and Performance
In order to get more performance from today's multicore processors, software must run in parallel. Explore the most common design patterns and anti-patterns when creating error free, efficient, multicore software. Find out how to create good parallel software, and eliminate the bad, and the ugly, using Intel® software development tools and examples based on the movie. Recommended companion technical webinar: Easy Ways to Solve Parallel Performance Challenges.
Speaker: Eric Moore, Senior Software Engineer, Performance, Analysis, and Threading Group, Intel® Software Development Products
Eric has worked at Rational, Microsoft, RealNetworks, Digital, Compaq, and Keane. His specialties include performance tuning, threading, compilers, CPU architecture, operating systems, and high performance computing. In the past 9 years, Moore has trained and consulted with more than 1000 engineers in performance optimization, including engineers in healthcare, games, oil, manufacturing, labs, universities, enterprise, security, and the military, and all over the world, including North America, South America, Asia, and Europe.
On-Demand
Identify and Address Threading Opportunities
Parallelize client applications using Intel® Parallel Studio. Identify where to parallelize code and how to go about making the changes. This demonstration covers key tool capabilities: identify hot spots that would benefit from threading, use speculative evaluation to find threading barriers, determine if barriers are really limiting or can be overcome, and overcome threading barriers by adding locks or restructuring code. Effective techniques combined with compelling examples using OpenMP* and Intel® Threading Building Blocks will help developers apply insights to applications and take advantage of multicore hardware for better performance. Recommended companion technical webinar: The Key to Scaling Applications for Multicore.
Speaker: Caroline Davidson, Staff Software Engineer, Intel® Software Development Products
Caroline joined Intel through the acquisition of Compaq Corporation's Visual Fortran team in 2001. She has over 11 years of compiler code generation experience as a member of Digital Equipment's GEM Compiler System and over 10 years of experience integrating with Microsoft Visual Studio, Borland, and Apple Xcode IDEs. Her roles have included new product R&D, program management, and third-party liaison. Caroline earned her B.S. in computer science from SUNY, Stony Brook.
On-Demand
Simplifying Parallelism Implementation with Intel® Threading Building Blocks
Use the Intel® Threading Building Blocks (Intel® TBB) template library to introduce parallelism into applications. The use of lambda expressions available in Intel® Parallel Composer is discussed, along with data parallel and task parallel models of parallel programming. Specific focus is placed on representing common parallel programming patterns, such as pipelines and concurrent queues, using Intel TBB templates. The newest enhancements to the Intel TBB library are also explored, including task-to-thread affinity and task cancellation support.
Speaker: Michael D'Mello, Senior Technical Consulting Engineer, Intel® Software Development Products
For the last 20 years Mike's main area of focus has been models of parallel computation and parallel algorithms. He has worked in the area of parallel computing for Thinking Machines Corporation, Convex Computer Corporation, The Hewlett-Packard Company, and Intel Corporation. He holds a Ph.D. in the field of quantum dynamics from the University of Texas, Austin. Mike has been with Intel Corporation since 2003.
On-Demand
Static Analysis and Intel® C++ Compilers
Static analysis helps find application issues, such as runtime error conditions, resource leaks, and security issues. Explore the static analysis capabilities of Intel® Parallel Composer's C++ Compiler and Source Checker tool. Source Checker acts as "parallel lint" to provide source file diagnostics that help eliminate bugs, boundary violations, and memory corruption. It builds on the compiler’s interprocedural analysis capability to provide whole-program error detection including routine mismatches, variable misuse, OpenMP directive errors, and more.
Speaker: Dmitry Putunin
On-Demand
Solve Parallelism with Intel® Parallel Studio
Get a first-hand technical walk-through of the parallel programming tools in the new Intel® Parallel Studio. This demonstration addresses parallelism opportunities in the classic NQueens problem many developers are familiar with from their computer science or engineering training. The NQueens solutions used in this webinar ship with Parallel Studio, so developers who download and install Parallel Studio can follow along on their own systems.
Speaker: Joe Wolf, Manager, Intel® Compilers Technical Consulting and Support Team, Intel® Software Development Products
Joe has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years in software development, technical support, and training for Intel software tools. He specializes in vectorizing and parallelizing compilers.
On-Demand
Parallel Implementation Methods with Intel® Parallel Composer
In-depth coverage of C/C++ parallelization methods for Microsoft Visual Studio* C++ developers. Find out how to use Intel® Parallel Studio to apply parallelism methods such as OpenMP* 3.0, Intel® C++ language extensions for parallelism, Lambda functions, threaded libraries, Intel® Threading Building Blocks, Intel® Integrated Performance Primitives, and more. This session also demonstrates debugging tools such as Intel® Parallel Debugger Extension and Intel® Parallel Inspector. Recommended companion technical webinar: Simplify Parallelism with Intel® Parallel Composer.
Speaker: Ganesh Rao, Intel® Compiler Lab, Intel® Software Development Products
Ganesh helps customers implement concurrency in applications and take advantage of optimization techniques offered by the Intel® C++ Compilers in the real world. Ganesh has more than 15 years of experience in the areas of application tuning and benchmarking. He has a broad array of applications experience including computer games, enterprise applications, and high performance computing environments. Prior to joining the Intel Compiler Lab 9 years ago, he helped with performance modeling of chipsets in the microprocessor group.
On-Demand
Find Errors in Windows C++ Parallel Applications
Address debugging issues encountered when adding parallelism to existing code or creating new parallel applications. This high-level survey examines the Intel® Parallel Debugger Extension for those familiar with Microsoft Visual Studio*, Visual C++, and the Microsoft Visual Studio debugger. It covers topics such as how to zero in on data for analysis, set up filters to control the amount of data collected, serialize parallel regions without recompilation, and use a new class of data breakpoints. The hands-on webinar shows the debugger in use, including added windows to help visualize logs. Recommended companion technical webinar: Debugging Parallel Code for Fast, Reliable Applications.
Speakers: Robert Mueller-Albrecht, Intel® Software Development Products and Bernth Andersson
After completing his MSc at the University of Kaiserslautern, Robert spent another two years in the field of physics research before joining CAD-UL in 2000. There he focused on customer support and technical consultancy for embedded systems development tools. Since 2001, he has continued in this focus at Intel in Arizona working with development tools solutions for the cellular and handheld space. In recent years, Robert's focus has shifted towards consulting, evangelism, and requirements gathering for development tools targeting the Intel® Atom™ Processor, debug solutions for the consumer electronics space, and the world of multithreaded, highly parallel applications.
On-Demand
Stanford University builds 212-node Intel® Cluster Ready (ICR) Cluster in 11 days
Dell, Clustercorp, Panasas, and Intel discuss the different approach used to enable fast cluster installation and use.
Scalable HPC with significant performance boosts is now a reality. In just 11 days, the Stanford University High-Performance Computing Center (HPCC) was able to fully implement a 212-node, 1,696-processor solution provided by the Intel® Cluster Ready Program, Dell, Clustercorp, and Panasas. Simplified standards-based clustering is enabling faster, more accurate HPC and providing unprecedented flexibility for application development. This panel discussion with the key providers of the Stanford solution offers a first-hand look at the challenges and opportunities ahead.
Speaker: Steve Jones, High Performance Computing Center, Stanford University
Jones currently runs the High Performance Computing Center at Stanford University, supporting sponsored research for The Department of Energy Advanced Simulation and Computation Program (ASC), and the next-generation Predictive Science Academic Alliance Program (PSAAP).
On-Demand
High-Confidence Clustering with Intel® Cluster Ready
The Intel® Cluster Ready (ICR) Program ensures that certified clusters are based on a standard specification, which simplifies deployment and assures interoperability. This enables HPC software and cluster providers to deliver quality solutions to wider audiences more effectively. Solutions can be designed and deployed with the confidence that any registered application will run on any certified cluster. This standards-based approach delivers clusters that are ready-to-run, providing an outstanding out-of-the-box experience.
In addition, ICR provides an automated software tool that analyzes the active configuration and performance of ICR clusters, which reduces maintenance and simplifies support. Find out how Intel Cluster Ready will enhance your productivity.
Speaker: Clem Cole, Intel Corporation
Clem Cole describes himself as an old-school hacker and "Open Sourcer." His most recent enterprise at Intel Corporation is leading the architecture and development of Intel® Cluster Ready. He has been using and developing computing systems since the late 1960s and has been directly or indirectly employed at firms small and large, from Masscomp and Stellar, to Locus Computing, to giants such as DEC/Compaq, AT&T, Sun, IBM, NCR, HP, and now Intel. Clem's work has ranged from CPU, workstation, and large system design, I/O controllers, OS development, and network protocol implementation to cluster file systems, single system image, and interconnect technologies. At least one of his projects became a business school case study (and not as a counter-example, either). Many projects became successful both commercially and as lead-ins to other research projects and spin-offs. Prior to Intel, he was vice president and distinguished engineer at Ammasso, where he helped develop the world's first iWARP implementation. Before that, he was vice president of engineering at Paceline Systems, which delivered the first 4x InfiniBand* switch with embedded subnet manager. He has degrees in EE, Math, and CS from CMU and UCB, has numerous publications, and has given many talks. Clem helped to write one of the original TCP/IP implementations in the late 1970s, and was one of the authors of the precursor to IM, the UNIX talk program, as well as other more humorous and notorious hacks. He is the current vice president of the USENIX Association.
On-Demand
Intel® Cluster Ready Solutions: Integrating New Technologies
The latest server platforms from Intel and other vendors are now available in Intel® Cluster Ready (ICR) solutions. The Intel® Cluster Ready Program works with program members to include the latest Intel server and cluster technologies, such as the Intel® Rapid Boot Toolkit. Cluster nodes can be quickly replaced or rebuilt if problems arise. ICR clusters provide fully automated installation of nodes, as required by the specification. The specification also assures that a single binary application can run on different interconnects. This seminar will demonstrate available ICR solutions requiring minimal IT management.
On-Demand
Intel® Cluster Ready: Realizing "Many-to-One"
Intel® Cluster Ready (ICR) is based on two principles: clusters support any registered application, and each application will work on any certified cluster. Using real-world examples, this seminar will show how applications install and run without modification on ICR clusters. Registered applications can be implemented with minimal IT support. The ICR specification defines standards that assure ISVs can create applications for a well-defined platform. In addition to run-time production clusters, ICR provides an optional, complete development environment.
Speakers: Kevin Noreen, Dell and Clem Cole, Intel Corporation
Clem Cole describes himself as an old-school hacker and "Open Sourcer." He first encountered the early editions of UNIX while a university student, writing a variety of device drivers, kernel enhancements, and microprocessor support tools for the early 4- and 8-bit microprocessors, and working on an early hardware description language (ISPS) using VAX serial #1, and has been developing operating systems and technical computing systems ever since. While his first experiences with computing were in the late 1960s, he began his official career at Tektronix in the 1970s, and has been directly or indirectly employed at firms small and large, from Masscomp and Stellar, to Locus Computing, to giants such as DEC/Compaq, AT&T, Sun, IBM, NCR, HP, and Intel. Clem’s work has ranged from CPU, workstation, and large system design, I/O controllers, OS development, and network protocol implementation to cluster file systems, single system image, and interconnect technologies. At least one of his projects became a business school case study (and not as a counter-example, either). Many projects became successful both commercially and as lead-ins to other research projects and spin-offs. Clement's most recent enterprise at Intel Corporation is leading the architecture and development of Intel® Cluster Ready. Prior to Intel, he was vice president and engineer at Ammasso, where he helped develop the world's first iWARP implementation. Before that, he was vice president of engineering at Paceline Systems, which delivered the first 4x InfiniBand switch with embedded subnet manager. Says Clem, "I am aware that my British friends drink TCP to cure hacks and nasal congestion; I prefer to hack TCP to cure network congestion." He has degrees in EE, Math, and CS from CMU and UCB, has numerous publications, and has given many talks.
Clem helped to write one of the original TCP/IP implementations in the late 1970s, and was one of the authors of the precursor to IM, the UNIX talk program, as well as other more humorous and notorious hacks. He is the current vice president of the USENIX Association.
On-Demand
Right-sizing Intel® Cluster Ready Clusters for ANSYS Performance
IT departments in all manufacturing organizations need to accommodate the pressure for shorter product design cycles with a limited budget for software and hardware expenses. While Linux-based clusters can help to address the ever-increasing demand for compute cycles, the concept of achieving high performance through interconnected systems introduces performance and manageability challenges. The ease of implementation and management provided by Intel® Cluster Ready-certified clusters is available without impacting performance. Penguin Computing and ANSYS will review how variations of hardware components (e.g., memory, storage, interconnects) impact the application performance of ANSYS 11. An efficient cluster architecture will be introduced that addresses manageability challenges inherent to traditional Linux clusters, including how the deployment and support of applications are simplified on certified Intel® Cluster Ready systems.
Speakers: Ray Browell, ANSYS; Josh Bernstein, Penguin Computing; and Arend Dittmer, Penguin Computing
Raymond Browell is a Senior Product Manager and the Mechanical Business Release Manager at ANSYS, Inc.; his products have made leading-edge technology practical by providing the best possible price-performance ratio, resulting in sales growth of 100%. As Product Manager, Ray has sought out new technologies and successfully integrated them into the product lines, expanding the business by seeking acquisition candidates and working at all levels of the organization. As MBU Release Manager, Ray has gained consensus on plans such as the 12.0 MBU Development Plan and the 12.0 MBU Quality Plan, providing for intra-team guidelines and individualized goals for the three MBU development sectors. In this capacity, Ray works extensively with licensing and pricing of products.
Raymond Browell is a Mechanical Engineer (M.S. in Mechanical Engineering, magna cum laude, University of Pittsburgh, 1983; B.S. in Mechanical Engineering, magna cum laude, University of Pittsburgh, 1980), a licensed Professional Engineer, and a member of both AIAA and ASME.
Joshua Bernstein is a Software Engineer with Penguin Computing. Joshua specializes in application integration and has a deep understanding of application characteristics that he applies to creating optimized cluster configurations for customers. Prior to working at Penguin Computing, Joshua was a Linux System Administrator at NASA's Lunar and Planetary Lab, where he worked on several missions including Cassini, HiRISE, and the Phoenix Mars Lander. Prior to NASA, Joshua worked for the College of Engineering and Mines at the University of Arizona, where he designed a cross-platform architecture for centralized account administration based exclusively on Open Source software. Joshua has been actively involved with the Open Source community and has contributed to projects such as SAMBA, MythTV, and Gallery. Joshua studied computer engineering at the University of Arizona.
Arend Dittmer is Director of Product Management for Penguin Computing. Arend is responsible for Penguin Computing's cluster management solution, Scyld ClusterWare. Before joining Penguin Computing in mid-2006, Arend held a variety of functions, all related to the field of Linux clustering. At Fujitsu-Siemens Inc. he was a business development manager for the company's HA Linux clustering product PRIMECLUSTER. As a Principal Consultant for Qlusters Inc., Arend deployed policy-driven, adaptive, enterprise computing solutions at customer sites. During his six-year tenure at Platform Computing as Integration Architect and Senior Consultant, Arend deployed and integrated workload management solutions with HPC applications. Arend holds the German equivalent (Dipl.-Ing.) of an M.S. in electrical engineering from the University of Erlangen-Nuremberg (Germany).
On-Demand
Intel® Cluster Ready: European Market
Intel® Cluster Ready (ICR) is a global program, with vendors worldwide providing localized solutions. With testimonials from major European vendors, this seminar will focus on solutions for the European ecosystem. Using real-world examples, we will demonstrate how applications install and run without modification on ICR clusters, and how registered applications can be implemented with minimal IT support.
See products and solutions from Intel and other Intel® Cluster Ready vendors at the International Supercomputing Conference 2008 in Dresden, Germany.
On-Demand
Using Multithreaded Libraries to Maximize Performance for Digital Media Apps
Developing multimedia solutions becomes more complicated when supporting various audio/video/image standards and fully utilizing applications in a multi-core environment. Solutions can be found in technologies and software tools for multithreading H.264 and JPEG/JPEG2000 support for a multi-core environment. Advanced features of the Intel® Integrated Performance Primitives (Intel IPP) libraries include creating a video–audio playback pipeline and control threads, measuring performance, and selecting the appropriate Intel IPP library. Learn several simple steps to adopt Intel IPP libraries and enhance performance for digital media applications in multi-core environments.
Speaker: Ying Song, Technical Consulting Engineer for Intel® Integrated Performance Primitives, Intel Software Solutions Group
Ying Song is responsible for consulting with applications developers on their use of Intel Integrated Performance Primitives libraries. She has worked for Intel for 11 years.
On-Demand
Clusters Made Easy - How to Deploy Cluster Systems Dramatically Faster
Now that MPI and OpenMP* have become standard for cluster and multi-core architectures, it's time to consider how to take full advantage of standards-based parallel computing. Most organizations using Intel clusters to develop or run modeling and simulation applications already understand the performance advantages of parallel computing. Intel® Cluster Ready clusters allow you to deploy a solution for users dramatically faster, with significant boosts in performance for developers using Intel® Cluster Tools. Learn why organizations are making Intel® cluster software technology their preferred development or production environment and explore the top things you should know to prepare for your current or next clusters.
Speaker: Werner Krotz-Vogel, Technical Marketing Engineer, Intel® Cluster Software Technology, Enterprise Software Solutions Division
Krotz-Vogel studied astrophysics in Cologne and became an expert in parallel computer architectures in the automation industry. He was system project manager for the first European parallel supercomputer, SUPRENUM. As a specialist in performance tools for more than a decade at Pallas, he contributed to standards such as PARMACS and MPI, and helped the Intel® Trace Analyzer and Collector reach its leading role in parallel software development tools.
On-Demand
Using Threaded Math Libraries for High Performance Computing Applications
With the advent of multi-core processors, challenges and opportunities for increased performance via parallelism have greatly increased. We will delve deeper into one of the key steps to parallelism presented in James Reinders' "Steps to Parallelism NOW" Webinar: how developers can gain great performance scaling with minimal programming effort by utilizing threaded libraries. We will examine math problems that lend themselves to parallel computation and discuss how they have been parallelized in the Intel® Math Kernel Library. We will also provide details on advanced threading control settings that can give users more control over how to utilize the increasing number of cores available on a single processor.
Speaker: Todd Rosenquist, Technical Consulting Engineer with the Intel® Math Kernel Library (Intel MKL)
Rosenquist has been involved with Intel MKL for seven years in various roles spanning engineering, release management, technical support, and professional services.