## Disclaimer

The MNIST dataset is public, so to avoid leaking the Kaggle test set into training, we train only on the training sets provided by Kaggle and Keras and test only on their test sets!

If you want to reach 1.00 on Kaggle MNIST, that is easy: download every copy of the MNIST data you can find online, train the model on all of it until it overfits, then predict the Kaggle test data, and you will get 1.00!

But that teaches you very little.....

## Preparing the datasets

1. Built into keras: 60000 training samples and 10000 test samples
2. kaggle - MNIST [https://www.kaggle.com/c/digit-recognizer/data](https://www.kaggle.com/c/digit-recognizer/data "https://www.kaggle.com/c/digit-recognizer/data"): 42000 training samples and 28000 test samples

Note: the kaggle test data has no labels

## Preparing the models

The models are prepared in two phases.

### Dense fully connected layers

![Dense fully connected layers](http://blog.a152.top/usr/uploads/2021/08/1106269163.png "Dense fully connected layers")

### Connected layers with convolution and pooling - CNN

![CNN with convolution and pooling layers](http://blog.a152.top/usr/uploads/2021/08/603253698.png)

## Implementing the models in code

<div class="tab-container post_tab box-shadow-wrap-lg">
<ul class="nav no-padder b-b scroll-hide" role="tablist">
<li class='nav-item active' role="presentation"><a class='nav-link active' style="" data-toggle="tab" aria-controls='tabs-f5b1f702697dfe80fc7e470cde97cc53430' role="tab" data-target='#tabs-f5b1f702697dfe80fc7e470cde97cc53430'>Fully connected</a></li><li class='nav-item ' role="presentation"><a class='nav-link ' style="" data-toggle="tab" aria-controls='tabs-6851594a020e91a7545658a6132d2af0141' role="tab" data-target='#tabs-6851594a020e91a7545658a6132d2af0141'>CNN</a></li>
</ul>
<div class="tab-content no-border">
<div role="tabpanel" id='tabs-f5b1f702697dfe80fc7e470cde97cc53430' class="tab-pane fade active in">

The model below has since been modified further, so it may no longer match the diagram.

```python
from keras import models
from keras import layers

network = models.Sequential()
# Input: a flattened 28x28 image (784 values)
network.add(layers.Dense(2048, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dropout(0.1))
network.add(layers.Dense(512, activation='relu'))
network.add(layers.Dropout(0.1))
network.add(layers.Dense(512, activation='relu'))
network.add(layers.Dropout(0.1))
network.add(layers.Dense(512, activation='relu'))
# Output: one probability per digit class (0-9)
network.add(layers.Dense(10, activation='softmax'))
#network.summary()
```

</div><div role="tabpanel" id='tabs-6851594a020e91a7545658a6132d2af0141' class="tab-pane fade ">
Still being written...</div>
</div>
</div>

## Data preprocessing

<div class="tab-container post_tab box-shadow-wrap-lg">
<ul class="nav no-padder b-b scroll-hide" role="tablist">
<li class='nav-item active' role="presentation"><a class='nav-link active' style="" data-toggle="tab" aria-controls='tabs-390004da9470204fc49cd69e8f2f2a01700' role="tab" data-target='#tabs-390004da9470204fc49cd69e8f2f2a01700'>Fully connected</a></li><li class='nav-item ' role="presentation"><a class='nav-link ' style="" data-toggle="tab" aria-controls='tabs-d323c61c36e5464aa2e178d42ffa5ea0351' role="tab" data-target='#tabs-d323c61c36e5464aa2e178d42ffa5ea0351'>CNN</a></li>
</ul>
<div class="tab-content no-border">
<div role="tabpanel" id='tabs-390004da9470204fc49cd69e8f2f2a01700' class="tab-pane fade active in">
<div class="panel panel-default collapse-panel box-shadow-wrap-lg"><div class="panel-heading panel-collapse" data-toggle="collapse" data-target="#collapse-d9bf65ad217dcacc707b25eadea620d252" aria-expanded="true"><div class="accordion-toggle"><span style="">Dataset built into keras</span>
<i class="pull-right fontello icon-fw fontello-angle-right"></i>
</div>
</div>
<div class="panel-body collapse-panel-body">
<div id="collapse-d9bf65ad217dcacc707b25eadea620d252" class="collapse collapse-content"><p></p>

### Loading the dataset

```python
from keras.datasets import mnist

# The "k" prefix marks the Keras arrays so they stay distinct from the Kaggle ones;
# the merge and evaluation steps refer to ktrain_* and ktest_*.
(ktrain_images, ktrain_labels), (ktest_images, ktest_labels) = mnist.load_data()
```

### Processing the training and test images

```python
# Flatten each 28x28 image to 784 values and scale pixels to [0, 1]
ktrain_images = ktrain_images.reshape((60000, 28 * 28))
ktrain_images = ktrain_images.astype('float32') / 255
ktest_images = ktest_images.reshape((10000, 28 * 28))
ktest_images = ktest_images.astype('float32') / 255
print(ktrain_images.shape)
```

After the steps above, the images have the same shape as the model's input.

Next, process the training and test labels:

```python
from keras.utils import to_categorical

# One-hot encode the digit labels (10 classes)
ktrain_labels = to_categorical(ktrain_labels)
ktest_labels = to_categorical(ktest_labels)
```

<p></p></div></div></div>
<div class="panel panel-default collapse-panel box-shadow-wrap-lg"><div class="panel-heading panel-collapse" data-toggle="collapse"
data-target="#collapse-a88174858c4d272ba09897f1cf6511dc70" aria-expanded="true"><div class="accordion-toggle"><span style="">Dataset downloaded from kaggle</span>
<i class="pull-right fontello icon-fw fontello-angle-right"></i>
</div>
</div>
<div class="panel-body collapse-panel-body">
<div id="collapse-a88174858c4d272ba09897f1cf6511dc70" class="collapse collapse-content"><p></p>

### Reading the dataset

Process the training images:

```python
import pandas as pd

df = pd.read_csv('train.csv')
# Every column except 'label' holds one pixel value
data = df.drop(['label'], axis=1)
train_images = data.values.reshape(42000, 28 * 28)
train_images = train_images.astype('float32') / 255
train_images.shape
```

### Processing the training labels

```python
from keras.utils import to_categorical

label = df['label']
# One-hot encode the digit labels (10 classes)
train_labels = to_categorical(label)
train_labels.shape
```

<p></p></div></div></div>

### Merging the two datasets

```python
import numpy as np

# train_* are the Kaggle arrays, ktrain_* are the Keras arrays
train_data = np.concatenate([train_images, ktrain_images], axis=0)
train_label = np.concatenate([train_labels, ktrain_labels])
print(train_data.shape)
print(train_label.shape)
```

</div><div role="tabpanel" id='tabs-d323c61c36e5464aa2e178d42ffa5ea0351' class="tab-pane fade ">
Content 2</div>
</div>
</div>

## Start training!
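
Before calling `fit`, it can be worth checking that the merged arrays really have the shape the network expects. The snippet below is only a rough sketch of such a check: `check_training_arrays` is a made-up helper (not part of keras), and the tiny random arrays merely stand in for the real Keras and Kaggle training sets so the example runs without downloading anything.

```python
import numpy as np


def check_training_arrays(data, labels, n_features=28 * 28, n_classes=10):
    """Hypothetical helper: validate arrays before passing them to network.fit."""
    # One row per image, flattened to 784 features.
    assert data.ndim == 2 and data.shape[1] == n_features
    # One one-hot row per image.
    assert labels.shape == (data.shape[0], n_classes)
    # Pixels should already be scaled into [0, 1].
    assert data.min() >= 0.0 and data.max() <= 1.0
    # Each label row should contain exactly one 1.
    assert np.all(labels.sum(axis=1) == 1)
    return data.shape[0]


# Small random stand-ins for the Keras and Kaggle training sets (assumption:
# the real arrays have 60000 and 42000 rows respectively).
rng = np.random.default_rng(0)
keras_x = rng.random((6, 28 * 28), dtype=np.float32)
kaggle_x = rng.random((4, 28 * 28), dtype=np.float32)
keras_y = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=6)]
kaggle_y = np.eye(10, dtype=np.float32)[rng.integers(0, 10, size=4)]

train_data = np.concatenate([kaggle_x, keras_x], axis=0)
train_label = np.concatenate([kaggle_y, keras_y], axis=0)

n_samples = check_training_arrays(train_data, train_label)
print(n_samples)
```

With the real data the same call should report 42000 + 60000 = 102000 samples; if an assertion fails, the preprocessing step is the first place to look.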
<div class="tab-container post_tab box-shadow-wrap-lg">
<ul class="nav no-padder b-b scroll-hide" role="tablist">
<li class='nav-item active' role="presentation"><a class='nav-link active' style="" data-toggle="tab" aria-controls='tabs-2f232ce55586650c5153488b8c10171c480' role="tab" data-target='#tabs-2f232ce55586650c5153488b8c10171c480'>DNN</a></li><li class='nav-item ' role="presentation"><a class='nav-link ' style="" data-toggle="tab" aria-controls='tabs-0a5d3eb05c130a754c75c5c5f1fec17721' role="tab" data-target='#tabs-0a5d3eb05c130a754c75c5c5f1fec17721'>CNN</a></li>
</ul>
<div class="tab-content no-border">
<div role="tabpanel" id='tabs-2f232ce55586650c5153488b8c10171c480' class="tab-pane fade active in">

```python
network.compile(optimizer='sgd',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
network.fit(train_data, train_label, epochs=50, batch_size=5000)
```

After 50 epochs the model will have overfit these datasets.

Other optimizers can be used as well, for example rmsprop.

Once training is finished, remember to save the model:

```python
network.save('mnistdnn814.h5')
```

Evaluate on the keras test set:

```python
test_loss, test_acc = network.evaluate(ktest_images, ktest_labels)
print('test_acc:', test_acc)
```

Predict the kaggle test set:

```python
import numpy as np
import pandas as pd

test_data = pd.read_csv('test.csv')
test_data = test_data.values.reshape(28000, 28 * 28)
test_data = test_data.astype('float32') / 255
# The trained model object is named `network`
result = network.predict(test_data)
# Pick the most probable digit for each image
result = np.argmax(result, axis=1)
# Kaggle expects ImageId to start at 1
r = {"ImageId": [i + 1 for i in range(28000)], "Label": result}
r = pd.DataFrame(r)
r.to_csv('result.csv', index=False)
print("Done! Results saved to result.csv!")
```

</div><div role="tabpanel" id='tabs-0a5d3eb05c130a754c75c5c5f1fec17721' class="tab-pane fade ">
Content 2</div>
</div>
</div>
<div class="panel panel-default box-shadow-wrap-lg goal-panel">
<div class="panel-heading">
Progress
</div>
<div class="list-group">
<div class="list-group-item">
<p class="goal_name"> Still in progress :</p>
<div class="progress-striped active m-b-sm progress" value="dynamic" type="danger">
<div class="progress-bar progress-bar-warning" role="progressbar" aria-valuenow="97" aria-valuemin="0" aria-valuemax="100"
style="width: 50%;"><span> 50% </span></div>
</div></div></div></div>

Last modification: August 26, 2021