Course Outline
Overview of CANN Optimization Capabilities
- Approaches to managing inference performance within CANN.
- Optimization objectives for edge and embedded AI systems.
- Insights into AI Core utilization and memory allocation mechanisms.
Leveraging the Graph Engine for Analysis
- Introduction to the Graph Engine and its execution pipeline.
- Visualizing operator graphs and runtime metrics.
- Adjusting computational graphs to facilitate optimization.
Profiling Tools and Performance Metrics
- Employing the CANN Profiling Tool (profiler) for workload analysis.
- Evaluating kernel execution time and identifying bottlenecks.
- Performing memory access profiling and exploring tiling strategies.
Custom Operator Development with TIK
- Overview of TIK and the operator programming model.
- Building a custom operator using TIK DSL.
- Testing and benchmarking operator performance.
Advanced Operator Optimization with TVM
- Introduction to TVM integration with CANN.
- Auto-tuning strategies for computational graphs.
- Guidance on when and how to transition between TVM and TIK.
Memory Optimization Techniques
- Managing memory layout and buffer placement.
- Methods to decrease on-chip memory consumption.
- Best practices for asynchronous execution and resource reuse.
Real-World Deployment and Case Studies
- Case study: Performance tuning for a smart city camera pipeline.
- Case study: Optimizing the inference stack for autonomous vehicles.
- Guidelines for iterative profiling and continuous improvement.
Summary and Next Steps
Requirements
- Solid grasp of deep learning model architectures and training workflows.
- Hands-on experience deploying models via CANN, TensorFlow, or PyTorch.
- Proficiency in Linux CLI, shell scripting, and Python programming.
Target Audience
- AI performance engineers.
- Inference optimization specialists.
- Developers focused on edge AI or real-time systems.
Duration: 14 hours