|
Date |
Speaker |
| |
April 1 Registration required. |
Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory
Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
How Fast Is Fast: Measuring and Understanding Parallel Performance
Learn More |
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
In this seminar, Intel will show examples of tasks critical to parallel development, including accessing various system timers; comparing the performance of various locks in Intel® Threading Building Blocks (TBB) and on various hardware architectures; recognizing the pointlessness of optimizing code that is not on the critical path; and avoiding measurement mistakes involving unsynchronized clocks, adjustments for GHz differences, and Amdahl's law limits on speedup. |
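As background for the Amdahl limitation mentioned in the abstract above, here is a minimal sketch of Amdahl's law. It is not taken from the seminar; the function name `amdahl_speedup` and its parameters are illustrative assumptions. The formula itself is standard: if a fraction p of the runtime can be parallelized across n workers, the speedup is at most 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only `parallel_fraction` of the runtime scales.

    Hypothetical helper for illustration; not part of any Intel tool.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Print the bound for a program that is 90% parallelizable.
    for workers in (1, 2, 4, 8, 64):
        print(workers, round(amdahl_speedup(0.9, workers), 2))
```

With 90% of the runtime parallelizable, the bound approaches but never reaches 10x no matter how many workers are added, which is why measured speedups should be compared against this limit rather than against the core count.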
|
| |
April 15 Registration required. |
John O'Neill, Engineer, Intel Developer Products Division
Shwetha Doss, Technical Consulting Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
A Practical Threading Methodology
Learn More |
O'Neill currently works on the Intel® C++ Compiler. He has been with Intel for eight years as a compiler technical consultant, presented talks at numerous conferences, and written many articles and white papers. O'Neill works closely with companies in financial services and digital media, and with enterprise ISVs, helping them take advantage of Intel® software and hardware technologies. Before joining Intel, O'Neill was a research associate at the University of Minnesota, conducting research in elementary particle physics and developing large-scale scientific applications. O'Neill has published articles in refereed journals and books, and holds a Ph.D. in physics from the University at Albany, State University of New York.
Doss supports the Intel® Threading Analysis Tools and consults with strategic customers. She has also written papers and presented talks about the tools.
While threading can be a challenge, new software development tools help simplify the process by identifying thread correctness issues and performance opportunities. We will present a case study on threading a Black-Scholes application, a common application in the financial services industry, to improve application performance and responsiveness. We will then look at a methodology used to successfully thread applications and at valuable tools that support this threading methodology. |
|
| |
April 29 Registration required. |
Levent Akyil, Software Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Fundamental Topics in the Design of Parallel Programs
Learn More |
Akyil has been with Intel for seven years, and held positions in the Digital Enterprise and Software and Solutions Groups. He worked on various Intel® Itanium®-based platforms and enterprise MP server projects before moving to his current role, where he provides technical consulting support for Intel® Software Developer Products. Akyil works with internal and external strategic customers providing enabling and optimization support on Intel® platforms. He has an M.S. in computer science and an MBA in technology and innovation management.
This seminar introduces several topics all developers of parallel code should know and be familiar with, such as parallel algorithm design, data and functional decomposition, various synchronization primitives, standard scheduling mechanisms, work stealing, cache coherence, and false sharing. |
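To make the idea of data decomposition from the abstract above concrete, here is a minimal sketch, not drawn from the seminar. The helper names `split_range` and `parallel_sum_of_squares` are hypothetical. The input range is split into contiguous chunks of nearly equal size, each chunk is processed independently, and the partial results are combined. Note that in CPython, threads do not speed up CPU-bound work because of the global interpreter lock, so this shows only the structure of the decomposition, not a performance gain.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(n, parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def sum_of_squares(chunk):
    """Work performed independently on one chunk of the data."""
    return sum(i * i for i in chunk)


def parallel_sum_of_squares(n, workers=4):
    """Data decomposition: each worker handles one chunk, then partial sums are combined."""
    chunks = split_range(n, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(1000))
```

Because each chunk touches a disjoint part of the input and the only shared step is the final reduction, no locks are needed here; functional decomposition, by contrast, would assign different kinds of work to different workers.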
|
| |
May 13 Registration required. |
Dr. Jayant DeSouza, Engineer, Performance, Analysis, and Threading Lab, Software and Solutions Group, Intel
Optimizing Parallel Programs: Symptoms to Solutions
Learn More |
Over the last four years at Intel, DeSouza's roles have included new product R&D, product lifecycle management, and strategic customer engagement. He has substantially improved production code performance for several Intel customers through use of Intel® Software Tools. DeSouza received his Ph.D. from the University of Illinois at Urbana-Champaign, where he specialized in parallel programming.
A systematic approach to symptom analysis can help solve multi-core performance and correctness issues. We will cover a set of symptoms and their possible causes. These include slowdown despite adding cores or increasing the number of threads, slowdown when giving threads larger pieces of work, slowdown due to scheduling issues, and slowdown or errors due to synchronization. |
|
| |
May 27 Registration required. |
Ying Song, Technical Consulting Engineer for Intel® Integrated Performance Primitives, Intel Software Solutions Group
Boosting Performance of Imaging Solutions by Adopting New Deferred Mode Image Processing (DMIP) Layer
Learn More |
Song is responsible for consulting with applications developers on their use of Intel Integrated Performance Primitives libraries. She has worked for Intel for 11 years.
A typical image processing task performs data type conversion, filtering, or thresholding one after another, and applies Intel® Integrated Performance Primitives (Intel® IPP) libraries without considering the order of calculations. With growing demand for handling larger images in complex image processing tasks, improvements for pipelined operations on images become more important. These improvements support better memory utilization and much faster performance in multi-threaded environments. We'll introduce a new implementation, the Deferred Mode Image Processing (DMIP) layer, built on top of Intel IPP. We'll also share the latest performance benchmark comparisons for typical image processing tasks used in imaging solutions for medical and multimedia applications. |
|
| |
June 10 Registration required. |
Ganesh Rao, Intel Developer Products Division
Future Parallelization Technologies?
Learn More |
Ganesh has more than 15 years of experience in the areas of application tuning, benchmarking, and developer support. Ganesh has a broad array of applications experience, including computer games, enterprise applications and high performance computing environments. Currently Ganesh is focusing on training and supporting customers with development tools and benchmarks.
Whatif.intel.com hosts Intel's latest prototype products that explore new parallelization technologies, as well as community forums to discuss these ideas with other software technologists. Find out what Whatif.intel.com has to offer. |
|
| |
On-Demand Registration required. |
Herb Sutter, Software Architect, Microsoft
The Concurrency Revolution
Learn More |
Sutter is a chair of the ISO C++ standards committee. Among his books and papers is the widely cited article "The Free Lunch Is Over," in which he coined the phrase "concurrency revolution" to describe the software sea change now in progress to exploit increasingly parallel hardware.
Although driven by the industry-wide shift to multi-core hardware architectures, concurrency is primarily a software revolution. We are now seeing the initial stages of the next major change in software development, as over the next few years the software industry brings concurrency pervasively into mainstream software development, just as it has done in the past for objects, garbage collection, generics, and other technologies. Sutter summarizes the issues involved, gives an overview of the impact, and describes what to expect over the coming decade. |
|
| |
On-Demand Registration required. |
James Reinders, Chief Evangelist of Intel Software Products
Steps to Parallelism NOW
Learn More |
Reinders is leading the development efforts for performance optimizing software tools including compilers, libraries, and threading analysis products. In 1989, Reinders joined Intel Corporation as a senior engineer and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers, and architecture work for a number of Intel processors and parallel systems. Reinders is the author of the latest O'Reilly Nutshell book "Intel® Threading Building Blocks," a monthly columnist for "The Gauntlet," found online at go-parallel.com, and the author of the book "VTune Performance Analyzer Essentials."
Practical advice on how to put parallelism in your programs today. Reinders will discuss how to pick the best approach and avoid common pitfalls, and will share his favorite rules of thumb on how to succeed with parallel programming. Reinders will set the context for the rest of the series, and explain how you can approach parallelism now in your applications. |
|
| |
On-Demand Registration required. |
Shwetha Doss, Technical Consulting Engineer at the Intel Performance, Analysis, and Threading Lab
Threading for Performance
Learn More |
Doss' current role involves supporting the Intel® Threading Analysis Tools and consulting with strategic customers. She has also written papers and given talks about the Intel® Threading Analysis Tools.
Doss will discuss some common performance issues specific to multithreaded applications and how these can be identified with analysis tools such as the VTune Performance Analyzer and Intel Thread Profiler and addressed with Intel® Threading Building Blocks (TBB). Intel TBB, a C++ template-based runtime library, consists of inherently scalable parallel algorithms and data structures that scale applications automatically as more processors and cores are detected in the underlying hardware. Intel TBB also helps address and fix performance issues that result from load imbalance and memory allocation. |
|
| |
On-Demand Registration required. |
Dr. Tim Mattson/James Reinders
Parallelism Programming Has Gone Mainstream: Are You Ready?
Learn More |
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
James Reinders is a senior engineer who joined Intel Corporation in 1989 and has contributed to projects including the world's first TeraFLOP supercomputer (ASCI Red), compilers and architecture work for the iWarp, Pentium® Pro, Pentium II, Itanium®, and Pentium® 4 processors. Reinders is currently the director of business development and marketing for Intel's Software Development Products and serves as the chief evangelist and spokesperson. He has been a leader in the creation of Intel's Software Products including product plans, support, technical marketing, marketing and business development. Reinders is also the editorial columnist for the monthly "The Gauntlet" at www.devX.go-parallel.com as well as the author of the Intel Press book titled "VTune™ Performance Analyzer Essentials" and contributor to the new book "Multi-Core Programming."
Tim Mattson will begin by discussing the trends driving parallel computing into the mainstream, with multi-core processors now and processors composed of many heterogeneous cores in the future. He will show how Intel is uniquely positioned to help software developers make the transition from sequential to parallel software. Parallelizing software can be complex, and for some this will be a difficult transition. For those who are well prepared, however, parallel computing is an opportunity to get a jump on the competition. Tim will close with a quick look at algorithms used in parallel computing and some domains where it has proven particularly valuable. James Reinders will then move from theory to practice. He will discuss the methodologies, techniques, and tools Intel has in place to help software developers transition to parallel computing. Finally, James will introduce our spring webinar series, where Intel's world-class programmers will do a deep dive on different facets of parallel computing and show what's new in the world of Intel's developer and threading tools. |
|
| |
On-Demand Registration required. |
Dr. Tim Mattson, Principal Engineer, Intel Application Research Laboratory
A Gentle Introduction to Parallel Software
Learn More |
Mattson joined Intel in 1993. Among his many roles, he was applications manager for the ASCI teraFLOPS project, helped create OpenMP, founded the Open Cluster Group (OSCAR), and launched Intel's programs in computing for the Life Sciences. Mattson earned a Ph.D. for his work on quantum molecular scattering theory (UCSC, 1985). This was followed by a postdoc at Caltech where he worked on the Caltech/JPL hypercubes. Currently, Mattson is conducting research on performance modeling for future multi-core microprocessors and how different programming models map onto these systems.
Dr. Tim Mattson, Principal Engineer at Intel's Microprocessor Technology Labs, will lead a webinar focused on actual code and the parallel programming APIs available to software developers. Tim will begin with an overview of the high level issues that apply to the task of creating a parallel program and then move on to consider the most commonly used parallel algorithms. He will then discuss the major parallel programming APIs (OpenMP*, MPI, and Windows* threads) showing how they are used with different algorithms and different platforms. After attending this webinar, developers should be conversant with major concurrent APIs and algorithms and be well positioned to start incorporating these techniques in their applications. |
|
| |
On-Demand Registration required. |
Gary Carleton
Software Performance Analysis for Multi-Core CPUs and Windows Vista*
Learn More |
Gary Carleton is a Senior Staff Software Engineer in the Performance Analysis Tools group at Intel Corp. He has been at Intel for 22 years and currently works on SW performance tools including the VTune™ Performance Analyzer. He has been an engineering manager and software engineer for Intel Corp, Cadre Technologies and Kaiser Engineers. He has a BS in Electrical Engineering and Computer Sciences from the University of California at Berkeley.
New operating systems continue to be introduced, while CPU cores multiply at a dizzying pace. This webinar will describe some of the special considerations presented by performance optimizations in multiple core environments on Microsoft's new Vista* operating system. We will demonstrate performance analysis using both low-intrusion interrupt technology and instrumentation-based techniques. Our primary tool in this will be the Intel® VTune™ Analyzer. Along the way we will discuss and show lesser-known VTune™ Analyzer features focused on these environments. In particular, we will talk about our latest research into selecting the best CPU performance events to use with the VTune™ Analyzer's Event Based Sampling feature in identifying exactly what operations are slowing down the processor. |
|
| |
On-Demand Registration required. |
Dr. David Mackay
Three Steps to Threading and Performance Part 1 - Thread Correctness: Maintaining Deterministic Results in Developing, Maintaining and Tuning Threaded Software
Learn More |
Dr. David Mackay is the technical lead for the Performance Analysis and Threading Tools consulting engineers team. David has been working with Intel® Threading Tools since joining the software products division in 2001. David joined Intel in 1992 as part of the Supercomputer Systems Division. He has been working on software optimization since then. David Mackay received his Ph.D. from Stanford University.
Part 1 of Three Steps to Threading and Performance discusses recommended techniques for managing unique correctness challenges in developing, maintaining and tuning threaded software. In this webinar, Dr. David Mackay discusses the techniques and processes needed to manage this effectively across groups. He presents a quick overview of Intel® Thread Checker, a product that finds challenging data races and deadlocks, and also has advanced features for development, debugging, tuning and maintenance. Attendees at this webinar will receive a grounding in the practices and techniques for developing, maintaining and tuning high quality threaded software. |
|
| |
On-Demand Registration required. |
Victoria Gromova
Three Steps to Threading and Performance Part 2 - Expressing Parallelism: Case Studies with Intel® Threading Building Blocks
Learn More |
Victoria Gromova works as a Senior Software Engineer at Intel Corporation. She contributed to design and development of tools for multi-threaded application performance analysis, such as Intel® Thread Profiler. Victoria joined the Performance Analysis and Threading Tools consulting team in early 2006 and has worked on many threading projects, including threading open source libraries for simulating rigid body dynamics.
Part 2 of Three Steps to Threading and Performance examines major threading methodologies and paradigms. Victoria Gromova will discuss Intel® Threading Building Blocks, a C++ template-based runtime library. Building on the strengths of familiar programming tools such as the Standard Template Library, Intel® Threading Building Blocks provides generic parallel containers, idioms and paradigms to express parallelism without managing the threads yourself. Victoria will demonstrate the strength of generic-based coding for concurrency, performance and scalability through its application to a complex games physics engine library. She will briefly contrast OpenMP*, OS threads and Intel® Threading Building Blocks tasks. Attendees at this webcast will understand why, when and how to begin using the Intel® Threading Building Blocks to introduce concurrency into their serial code. |
|
| |
On-Demand Registration required. |
Vasanth Tovinkere
Three Steps to Threading and Performance Part 3 - Tuning Threaded Software: Next Steps After Concurrency
Learn More |
Vasanth Tovinkere is a Senior Staff Engineer at the Intel Performance, Analysis and Threading Lab in the Developer Products Division (DPD). His current role involves supporting the Intel® Threading Tools and consulting with strategic customers through the Threading Immersion Program. He has also been involved in the development of automatic semantic event detectors for digital sports technologies in Intel Labs. His research interests include data mining of performance and trace data and fuzzy inference engines. Vasanth began his career at Intel in 1997 as an engineer, researching threading behavior and performance and working with early adopters on Wall Street to enable them for multi-processor architectures. Prior to joining Intel, he was involved in the development of automated fuzzy pattern recognition algorithms for NASA's Mission to Planet Earth Program.
Part 3 of Three Steps to Threading and Performance addresses optimization issues faced by developers after getting their threaded applications to run correctly and to produce deterministic results. Vasanth Tovinkere suggests two relatively simple but crucial steps for creating optimized threaded code. The first step is an inspection of the software architecture and thread interactions. This is key to overcoming the performance degradation often encountered in newly threaded applications. Next, we discuss how developers can further analyze the behavior of the application on a given platform and understand the runtime implications on the platform architecture. Our primary tools for both are the VTune™ Analyzer and the Intel® Thread Profiler. After attending this webinar, developers should be able to understand why threaded code sometimes has performance issues, how to isolate the problem as well as how to optimize the code using VTune™ and Intel® Thread Profiler. |
|
| |
On-Demand Registration required. |
Joe Wolf
Using Intel® C++ and Fortran Compilers, Version 10.0 for Performance, Multithreading, and Security
Learn More |
Joe Wolf has worked with compilers and tools for applications from supercomputers to the desktop for over twenty years as a developer, in technical support and training for Intel's software tools, and now as a manager of the Intel® Compilers Technical Consulting and Support Team. He has specialized in vectorizing and parallelizing compilers, as well as multithreading.
Learning how to introduce parallelism and optimizations within serial code is not easy. In this installment of Intel's multi-core webinar series, you will learn how to take advantage of Intel's new parallel optimization technology to allow you to vectorize the inner loops and parallelize outer loops without conflicts for both auto-parallelization and OpenMP*. We will also cover how to find some common, but annoying, security issues and coding errors in C & C++ with our new verification features. |