Table Of Contents
Preface
Installation
Notation
1. Introduction
2. Preliminaries
2.1. Data Manipulation
2.2. Data Preprocessing
2.3. Linear Algebra
2.4. Calculus
2.5. Automatic Differentiation
2.6. Probability
2.7. Documentation
3. Linear Neural Networks
3.1. Linear Regression
3.2. Linear Regression Implementation from Scratch
3.3. Concise Implementation of Linear Regression
3.4. Softmax Regression
3.5. The Image Classification Dataset
3.6. Implementation of Softmax Regression from Scratch
3.7. Concise Implementation of Softmax Regression
4. Multilayer Perceptrons
4.1. Multilayer Perceptrons
4.2. Implementation of Multilayer Perceptron from Scratch
4.3. Concise Implementation of Multilayer Perceptron
4.4. Model Selection, Underfitting and Overfitting
4.5. Weight Decay
4.6. Dropout
4.7. Forward Propagation, Backward Propagation, and Computational Graphs
4.8. Numerical Stability and Initialization
4.9. Considering the Environment
5. Deep Learning Computation
5.1. Layers and Blocks
5.2. Parameter Management
5.3. Custom Layers
5.4. File I/O
5.5. GPUs
6. Convolutional Neural Networks
6.1. From Dense Layers to Convolutions
6.2. Convolutions for Images
6.3. Padding and Stride
6.4. Multiple Input and Output Channels
6.5. Pooling
6.6. Convolutional Neural Networks (LeNet)
7. Modern Convolutional Neural Networks
7.1. Deep Convolutional Neural Networks (AlexNet)
7.2. Networks Using Blocks (VGG)
7.3. Network in Network (NiN)
7.4. Networks with Parallel Concatenations (GoogLeNet)
7.5. Batch Normalization
7.6. Residual Networks (ResNet)
7.7. Densely Connected Networks (DenseNet)
8. Recurrent Neural Networks
8.1. Sequence Models
8.2. Text Preprocessing
8.3. Language Models and the Dataset
8.4. Recurrent Neural Networks
8.5. Implementation of Recurrent Neural Networks from Scratch
8.6. Concise Implementation of Recurrent Neural Networks
8.7. Backpropagation Through Time
9. Modern Recurrent Neural Networks
9.1. Gated Recurrent Units (GRU)
9.2. Long Short-Term Memory (LSTM)
9.3. Deep Recurrent Neural Networks
9.4. Bidirectional Recurrent Neural Networks
9.5. Machine Translation and the Dataset
9.6. Sequence to Sequence Learning
9.7. Beam Search
10. Attention Mechanisms
10.1. Attention Cues
10.2. Attention Scoring Functions
10.3. Multi-Head Attention
10.4. Self-Attention and Positional Encoding
11. Optimization Algorithms
11.1. Optimization and Deep Learning
11.2. Convexity
11.3. Gradient Descent
11.4. Stochastic Gradient Descent
11.5. Minibatch Stochastic Gradient Descent
11.6. Momentum
11.7. Adagrad
11.8. RMSProp
11.9. Adadelta
11.10. Adam
11.11. Learning Rate Scheduling
12. Computational Performance
12.1. Automatic Parallelism
12.2. Hardware
12.3. Concise Implementation for Multiple GPUs
12.4. Parameter Servers
13. Computer Vision
13.1. Object Detection and Bounding Boxes
13.2. Anchor Boxes
13.3. Multiscale Object Detection
13.4. The Object Detection Dataset
14. Natural Language Processing: Pretraining
14.1. Word Embedding (word2vec)
14.2. Approximate Training
14.3. The Dataset for Pretraining Word Embedding
14.4. Word Embedding with Global Vectors (GloVe)
14.5. Subword Embedding
References
Index