LOCATION
Wanda Metropolitano Stadium | Madrid, Spain
We are pleased to invite you to the annual Intel Software professionals meeting 2017. Come and discover new possibilities for your organisation at Atlético de Madrid's new stadium, on 7 and 8 November 2017.
In these sessions we will preview what is coming in Artificial Intelligence and Deep Learning, and show you how to program and modernize your code with the latest Intel tools for the most recent processors.
Don't miss the opportunity to take part in the Intel Code Modernization Meeting 2017, essential for software developers, architects, project managers and researchers.
The first day offers technical sessions on HPC, parallel programming, vectorization, and performance optimization on Intel platforms for C/C++, Fortran and Python developers.
On the second day, the technical sessions will focus on Artificial Intelligence, looking at how these technologies can be deployed on future CPUs and how to scale performance to any workload.
DAY 1 – 7 November
08:15 – 09:15 Registration with light breakfast
09:15 – 09:30 Welcome & Introduction
09:30 – 10:15 Parallelism, Performance & Optimization on Intel Architecture
Starting with a brief overview of the latest Intel silicon roadmap, we look at how you can use Intel Parallel Studio XE 2018 to get
the best performance on both the new Intel® Xeon® Scalable Processors (Purley / Skylake-SP) as well as the Intel® Xeon Phi™
processor family (Knights Landing and Knights Mill). We then discuss three key topics (Vectorization with AVX512, Threading, and
Memory) that you need to address when modernizing code.
Stephen Blair-Chappell
10:15 – 11:15 Practical Session 1: Using Intel Parallel Studio to answer the question “why is my program running so slow?”
In this session, we use three Intel tools, Intel® Trace Analyzer and Collector, Intel® VTune Amplifier XE, and Intel® Vectorization
Advisor to track down the reasons for slow running code in a Lattice Quantum Chromodynamics (LQCD) code. The example is
based on a real problem reported by the HPC community.
Stephen Blair-Chappell
11:15 – 11:45 Coffee break
11:45 – 12:30 Striding Towards Perfection – A step-by-step narrative on optimizing the k-means algorithm
A look at how code modernization techniques are being used in the scientific community to produce code that takes best
advantage of the latest generation of CPU hardware. In this session we improve the performance of the k-means clustering
algorithm written in C++ by first working on the vectorization, followed by improving the threading of the code. The final version
is benchmarked on the latest generation of Intel Xeon and Intel Xeon Phi processors.
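The session's actual C++ code is not part of this programme, but the core idea, replacing a scalar per-point loop with a single array-wide operation the hardware can vectorize, can be sketched in a few lines. This is a minimal illustration assuming NumPy, not the session's material; `assign_naive` and `assign_vectorized` are hypothetical names for the k-means assignment step.

```python
import numpy as np

def assign_naive(points, centroids):
    """Assignment step of k-means: one Python-level loop per point (slow)."""
    labels = np.empty(len(points), dtype=int)
    for i, p in enumerate(points):
        labels[i] = ((centroids - p) ** 2).sum(axis=1).argmin()
    return labels

def assign_vectorized(points, centroids):
    """Same step with broadcasting: all point-to-centroid distances are
    computed in one shot, exposing the work to vectorized array kernels."""
    diff = points[:, None, :] - centroids[None, :, :]   # shape (n, k, d)
    return (diff ** 2).sum(axis=2).argmin(axis=1)

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
cents = rng.normal(size=(4, 3))
print((assign_naive(pts, cents) == assign_vectorized(pts, cents)).all())
```

The same restructuring in C++ is what lets the compiler emit AVX512 vector instructions over the inner distance loop.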
12:30 – 13:30 Practical Session 2: Tuning Vectorized Code Using Intel Vector Advisor
In this session, we show how to use Intel Vector Advisor to check how well your code is being vectorized and whether it is using the latest instruction sets available, such as AVX512. Additionally, we look at various memory issues, such as non-contiguous memory accesses and unit-stride vs. non-unit-stride accesses, and how eliminating such issues can lead to a significant speed-up of vectorized code and improve the quality of code generated automatically by the compiler.
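The unit-stride vs. non-unit-stride distinction the session covers is easy to see from Python: in a C-ordered 2-D array, walking along a row touches adjacent memory, while walking down a column jumps a whole row's worth of bytes per element. A small NumPy illustration (an assumption of this note, not session material):

```python
import numpy as np

# A C-ordered 2-D array of float64: rows are contiguous, columns are not.
a = np.zeros((1000, 1000), dtype=np.float64)

row = a[0, :]   # unit-stride view: consecutive elements 8 bytes apart
col = a[:, 0]   # non-unit-stride view: consecutive elements 8000 bytes apart

print(row.strides)                   # (8,)
print(col.strides)                   # (8000,)
print(row.flags['C_CONTIGUOUS'])     # True
print(col.flags['C_CONTIGUOUS'])     # False
```

A vectorizing compiler faces exactly the same issue: unit-stride loads map to fast packed vector loads, while strided accesses force slower gather operations, which is why Advisor flags them.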
13:30 – 14:30 Lunch break
14:30 – 15:30 Using the Intel Compiler to create portable applications
In this session we take a close look at how you can use the Intel compiler to bring performance and portability to your vectorized
applications. We show how you can take full advantage of the latest instructions sets – such as AVX512 – and yet create
programs that can still safely run on earlier generations of CPU.
Additionally, we describe some of the recent compiler options supported by the latest version of the Intel compiler that improve the
reproducibility of floating point results.
15:30 – 15:45 Coffee break
15:45 – 16:30 Optimizing Python Code using the Intel Distribution of Python
It used to be the case that you would never use the words ‘performance’ and ‘Python’ in the same sentence. The Intel Distribution
of Python changes all that. In this first of a two-part session we show how you can speed up your Python code by
‘Cythonising’ it to achieve native performance.
Stephen Blair-Chappell
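To give a flavour of what ‘Cythonising’ targets: a compute-bound pure-Python loop like the hypothetical one below pays interpreter and boxing overhead on every iteration, which is exactly what Cython's C type declarations remove. This sketch is an assumption for illustration, not the session's code; the Cython variant shown in the comment lives in a `.pyx` file and must be compiled before use.

```python
def dot(xs, ys):
    """Pure-Python dot product: a typical candidate for Cythonising."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y   # each iteration is interpreted, each float boxed
    return total

# An equivalent Cython version (in a .pyx file) would declare C types, e.g.:
#   def dot(double[:] xs, double[:] ys):
#       cdef double total = 0.0
#       cdef Py_ssize_t i
#       for i in range(xs.shape[0]):
#           total += xs[i] * ys[i]
#       return total

print(dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
```

Once compiled, the typed loop runs as native machine code, which is where the “native performance” claim in the session title comes from.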
16:30 – 17:30 Practical Session 3: Roofline Analysis with Intel Vector Advisor
Learn how to run a Roofline Analysis using Intel Vector Advisor. The Roofline model combines locality, bandwidth, and different
parallelization paradigms into a single performance figure that shows the performance of the code under test.
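The Roofline model itself reduces to a simple formula: attainable performance is the lesser of the machine's compute peak and its memory bandwidth multiplied by the kernel's arithmetic intensity (FLOPs per byte moved). A back-of-envelope sketch, using illustrative machine numbers that are an assumption here, not measured figures:

```python
def attainable_gflops(intensity, peak_gflops, peak_bw_gbs):
    """Roofline model: performance is capped either by the compute peak
    (the flat roof) or by bandwidth times arithmetic intensity (the slope)."""
    return min(peak_gflops, peak_bw_gbs * intensity)

# Example kernel: y[i] += a * x[i] (daxpy-like).
# 2 flops per element; 3 * 8 bytes moved (load x, load y, store y).
intensity = 2 / 24  # flops per byte

# Hypothetical machine: 2 TFLOP/s peak, 100 GB/s memory bandwidth.
print(attainable_gflops(intensity, 2000.0, 100.0))  # bandwidth-bound
print(attainable_gflops(50.0, 2000.0, 100.0))       # compute-bound: 2000.0
```

A kernel landing under the sloped part of the roof, as in the first case, tells you to improve data locality before chasing more vector width.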
17:30 – 19:00 Networking Cocktail with drinks and finger food
DAY 2 – 8 November
08:15 – 09:15 Registration with light breakfast
09:15 – 09:30 Welcome & Introduction
09:30 – 10:15 AI Concepts and Use Cases
In this session, we will explore the concepts and applications of Deep Learning, with a focus on real world applications using the
Intel CPUs for training and inference.
Kiran Kotaru
10:15 – 10:45 Introducing the new Intel CPU Generation For AI
This session will introduce the architectural details and the key features of the latest Intel server CPUs from a software
development and AI perspective. We will cover both the new Intel® Xeon® Scalable Processors (Purley / Skylake-SP) as well as the
Intel® Xeon Phi™ processor family (code name Knights Landing and Knights Mill).
Ralph de Wargny
10:45 – 11:15 Intel Nervana Software Stack – Overview & Implementation
This session will cover Intel Nervana’s software stack for AI, Machine Learning and Deep Learning: from low-level libraries such as
MKL / MKL-DNN, through CPU-optimized frameworks (incl. neon, Caffe, TensorFlow, Theano) and development tools like VTune and
the Intel Python distribution, to the new Intel® Nervana™ Graph library (ngraph).
11:15 – 11:45 Coffee Break
11:45 – 13:00 Practical Frameworks Session 1: Using the Optimized Caffe Framework
In this session we show how to build Caffe optimized for Intel architecture, train deep network models using one or more
compute nodes, and deploy networks. In addition, various functionalities of Caffe are explored in detail, including how to fine-tune
models, extract and view features of different models, and use the Caffe Python API.
Walter Riviera
13:00 – 14:00 Optimizing Python Code using the Intel Distribution of Python
It used to be the case that you would never use the words ‘performance’ and ‘Python’ in the same sentence. The Intel Distribution
of Python changes all that. In this second of a two-part session we show how you can speed up your Python code ‘out-of-the-box’
by using the Intel Distribution of Python. In this session we use the Intel-optimized version of scikit-learn.
Stephen Blair-Chappell
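The ‘out-of-the-box’ point is that user code does not change: the Intel distribution links NumPy, SciPy and scikit-learn against MKL, so the same array operations are dispatched to threaded, vectorized kernels. A hypothetical sketch of such unchanged user code, assuming only NumPy:

```python
import numpy as np

# This code is identical under stock Python and the Intel Distribution for
# Python; in the latter, the matrix multiply below runs on MKL's GEMM.
rng = np.random.default_rng(1)
a = rng.normal(size=(256, 256))
b = rng.normal(size=(256, 256))
c = a @ b

# np.show_config() reveals which BLAS/LAPACK the NumPy build links against.
print(c.shape)  # (256, 256)
```

The same applies to scikit-learn estimators, whose hot loops bottom out in the same BLAS calls.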
14:00 – 15:00 Lunch Break
15:00 – 15:30 Case Study: Manufacturing package fault detection using Deep Learning
A proof of concept focused on adopting deep-learning technology based on Caffe* for manufacturing package fault detection.
Stephen Blair-Chappell
15:30 – 16:15 Practical Frameworks Session 2: Using TensorFlow
In this tutorial we show how to use the Intel-optimized version of TensorFlow together with the high-level neural networks library
Keras running on top of it. As well as demonstrating how to use these frameworks, the session will include a ‘live’ VTune analysis of the
frameworks and an explanation of how the Intel-implemented optimizations were achieved.
Vishnu Madhu
16:15 – 16:30 Coffee break
16:30 – 17:15 DL Inference using the Movidius Neural Compute Stick
Learn how trained models can be optimized for inference using the innovative Movidius™ Neural Compute Stick.
Stephen Blair-Chappell / Walter Riviera
17:15 – 17:30 Q&A and closing comments
17:30 – 18:30 Optional: guided tour of the Wanda Metropolitano, Atlético Madrid Stadium