NSF CyberTraining Program: Linear Algebra Preparation for Emergent Neural Network Architectures (LAPENNA)
“This (NSF CyberTraining) program seeks to prepare, nurture, and grow the national scientific research workforce for creating, utilizing, and supporting advanced cyberinfrastructure (CI) to enable and potentially transform fundamental science and engineering research and contribute to the Nation’s overall economic competitiveness and security. The goals of this solicitation are to (i) ensure broad adoption of CI tools, methods, and resources by the research community in order to catalyze major research advances and to enhance researchers’ abilities to lead the development of new CI; and (ii) integrate core literacy and discipline-appropriate advanced skills in advanced CI as well as computational and data-driven science and engineering into the Nation’s educational curriculum/instructional material fabric spanning undergraduate and graduate courses for advancing fundamental research.”
— from NSF CyberTraining Program Solicitation
Nature of the LAPENNA CyberTraining Program
The LAPENNA program provides essential knowledge to advance AI literacy and to sustain the growth of the workforce in the cyberinfrastructure (CI) ecosystem for data-driven science. It offers faculty, students, and researchers integrated expertise in numerical mathematics, linear algebra software, data-driven methods, and machine learning tools so that they can tackle day-to-day problems in data science applications. The program also aims to prepare college researchers to design and direct their own in-house data-driven science programs and to incorporate perspectives from their research into their course curricula. The knowledge and experience gained through LAPENNA will benefit CI practitioners as well as other interested parties, and is intended to foster new collaborations, research initiatives, and workforce training programs.
LAPENNA at NICS focuses on delivering algorithmic and computational techniques, numerical and programming procedures, and AI software implementations on emergent CPU cloud systems and GPU platforms. It runs two training sessions every year. The program delivers ten webinars/lectures, with supporting online tutorials available for general public use. Eight teams of faculty/researchers and students are selected to participate in each session, and each team consists of two members from one institution. The training for each cohort lasts six months and concludes with an on-site, one-week workshop. Follow-up Q&A sessions connect the college teams and the PIs during and after the training events and continue to provide them with hardware and software support. The online materials that LAPENNA delivers will be available and useful to all CI practitioners.
This training program consists of the following four integrated parts:
1) The first part is mainly introductory training with webinar lectures and hands-on exercises. The training materials cover general workstation operation, typical programming languages, an introduction to parallel computing, software tools, and open source libraries, with emphasis on using GPUs for data-driven science and AI applications.
2) The second part is composed of comprehensive lectures and hands-on exercises on the mathematics of data-driven science, machine learning, numerical linear algebra, and programming, to prepare each team for solving its assigned project. Example projects in image processing will be introduced.
3) The third part is a hands-on implementation of a deep neural network (DNN) on a GPU system, applied to the assigned use case study under the guidance of the PIs. Benchmarks on HPC systems are examined.
4) The last part is to integrate the materials into the undergraduate curriculum of the team’s college, as tools or projects in an academic course in the department of their choice.
We are currently recruiting faculty and researchers to participate in the Fall 2021 session of the LAPENNA program. Please contact Kwai Wong for additional information. A tentative schedule of the lecture series is available at Fall 2021.
Content and Schedule of the LAPENNA CyberTraining Program
Each session lasts six months and is composed of eight college teams. Each team has two members from the same institution: either two faculty members/researchers, or one faculty member and one student. The training program is organized around a project-driven environment using data from use cases developed by Dr. Kwai Wong of the Joint Institute for Computational Sciences (JICS) and Dr. Stan Tomov of the Innovative Computing Laboratory (ICL). Example projects primarily include image processing, traffic flow, materials science, and benchmarking problems. An autonomous vehicle testbed is also used to demonstrate how a practical application is solved step by step on a Jetson Nano GPU.
Listed below are the major components of the LAPENNA training program:
1) Basic Linear Algebra Subprograms (BLAS) and its variants on CPU and GPU
2) LAPACK and MAGMA libraries on CPU and GPU
3) Batched linear algebra and tensor computation on GPU
4) Performance measurement and benchmarks for multicore CPU and GPU
5) Numerical implementation of Multilayer Perceptron (MLP) and Convolutional Neural Network (CNN) on CPU and GPU
6) Mixed precision computation using GPU Tensor Core acceleration
7) Parallel computing for deep neural networks using multiple GPUs
8) Profiling and benchmarks of neural network problems on CPU and GPU
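To make the link between the first and fifth components concrete, here is a minimal sketch, not taken from the LAPENNA course materials, of a multilayer perceptron forward pass written as two general matrix-matrix products (the BLAS level-3 GEMM operation) followed by elementwise steps. The function names (`gemm`, `mlp_forward`, and the helpers), the layer sizes, and the weights are all hypothetical and chosen only for illustration. It uses plain Python lists so that it runs without any library; an actual implementation would call an optimized BLAS on CPU or GPU instead of the loop-based `gemm` shown here.

```python
import math


def gemm(a, b):
    """Return C = A * B for matrices stored as lists of rows (illustrative, unoptimized)."""
    k = len(b)
    assert all(len(row) == k for row in a), "inner dimensions must match"
    cols = len(b[0])
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(cols)]
            for row in a]


def add_bias(z, bias):
    """Add the same bias vector to every row of z."""
    return [[v + b for v, b in zip(row, bias)] for row in z]


def relu(z):
    """Elementwise rectified linear unit."""
    return [[max(0.0, v) for v in row] for row in z]


def softmax_rows(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in z:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def mlp_forward(x, w1, b1, w2, b2):
    """One hidden layer: softmax(relu(X W1 + b1) W2 + b2)."""
    hidden = relu(add_bias(gemm(x, w1), b1))
    return softmax_rows(add_bias(gemm(hidden, w2), b2))


# Hypothetical toy data: a batch of 2 samples with 2 features each.
x = [[1.0, 2.0], [0.5, -1.0]]
w1 = [[1.0, -1.0, 0.5], [0.0, 2.0, -0.5]]   # 2 inputs -> 3 hidden units
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 hidden units -> 2 classes
b2 = [0.0, 0.0]

probs = mlp_forward(x, w1, b1, w2, b2)
for row in probs:
    print([round(p, 4) for p in row])
```

Because each layer processes the whole batch with a single matrix product, the cost of the forward pass is dominated by GEMM, which is why tuned BLAS libraries on CPU and GPU matter for neural network performance.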
The lecture materials of the webinars are listed as follows.
1) Mathematical modeling basics and computational science, introduction to big data analytics, numerical linear algebra
2) Scientific computing, programming overview: C, Python, R, scripts, compiling and running code, projected example projects
3) Numerical linear algebra libraries, performance, machine learning, deep learning frameworks, GPU programming techniques
4) Data science training, Python modules, profiling tools, statistical data modeling, workflow, mathematics of DNN, TensorFlow
5) CUDA, cuBLAS, cuDNN, TensorFlow, deep learning models and methods, usage, profiling on GPU, Tensor Core, DNN on GPU
6) GPU software libraries, data libraries, machine learning, hyperparameters, AI frameworks, MagmaDNN, TensorFlow, Keras, MNIST benchmark
7) Big data tools and frameworks, I/O essentials, implementation, image models and big data applications, coding illustration
8) Building a neural network program using MagmaDNN, tensor computation
9) MagmaDNN, tensors, CNN, RNN, TensorFlow, usage of multiple GPUs, image classification
10) Mixed precision and parallel computation, application project presentations
The agenda of the one-week on-site workshop is listed below. The workshop will be held at the University of Tennessee. Travel expenses and stipends will be provided to the participants.
Day 1: Big Data and Modeling in Image Processing
Day 2: Workflow, Example Application Discussion, Problem Solving
Day 3: Problem Solving Code Deep Dive
Day 4: Problem Solving, Challenges, Suggestions
Day 5: Finalize Future Plans, Machine Learning Courses, and Agenda