6. Deep Learning Overview¶
Deep learning is a category of machine learning, and machine learning is in turn a category of artificial intelligence (AI). Deep learning uses neural networks to carry out machine-learning tasks such as regression and classification. This chapter introduces an overview of deep learning; later chapters go into more detail.
Audience & Objectives
This chapter builds on Regression & Model Assessment and Introduction to Machine Learning. After completing this chapter, you should be able to:
Define deep learning
Define a neural network
Connect the regression principles learned so far to neural networks
The goal of this book is to be an introduction to deep learning centered on chemistry and materials science. There are many other excellent resources on deep learning, so we touch on only a few of them here. These resources give more detailed explanations of particular topics, or cover topics this book does not (for example, image recognition). As an introduction to deep learning, I think Ian Goodfellow's book is a good starting point. If you want a deeper feel for the ideas, Grant Sanderson has published a short video series focused on neural networks, with an applied introduction to the topic. DeepMind publishes high-level videos showing what can be achieved with deep learning and AI. If you write in a research paper that deep learning is a powerful tool, the work most commonly cited is [this paper](https://www.nature.com/articles/nature14539) by Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, published in Nature. Zhang, Lipton, Li, and Smola have published a practical online book with examples implemented in the major deep learning frameworks Tensorflow, PyTorch, and MXNet. The DeepChem project provides many chemistry-focused examples and information about applying deep learning in chemistry. Finally, several deep learning packages offer brief introductions to deep learning through tutorials for their APIs: Keras, PyTorch.
My main advice to deep learning beginners is not to get too caught up in terminology and concepts inspired by neuroscience (that is, connections between neurons), and instead to view deep learning as a series of linear algebra operations on matrices that contain tunable parameters. Of course, concepts resembling neuroscience do appear here and there, such as the nonlinear functions (activations) used to join deep learning's linear algebra operations together. But neural networks are not an extension of neuroscience, and it is more appropriate to learn them as something separate from it. Even if the name makes them sound like neurons connected inside a brain, deep learning is essentially linear algebra operations described by a computation network (also called a computation graph).
Nonlinearity
A function \(f(\vec{x})\) is linear if it satisfies the following two conditions:

\[f(\vec{x} + \vec{y}) = f(\vec{x}) + f(\vec{y})\]

for any \(\vec{x}\) and \(\vec{y}\), and

\[f(s\vec{x}) = s\,f(\vec{x})\]

where \(s\) is a scalar. If \(f(\vec{x})\) does not satisfy these conditions, then \(f(\vec{x})\) is nonlinear.
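As a quick numerical sketch of these two conditions (using numpy; the example functions here are illustrative, not from the book):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
s = 2.5

f_lin = lambda v: 3 * v  # linear: multiplication by a constant
f_sq = lambda v: v**2    # nonlinear: element-wise square

# both conditions hold for the linear function
assert np.allclose(f_lin(x + y), f_lin(x) + f_lin(y))
assert np.allclose(f_lin(s * x), s * f_lin(x))

# the first condition fails for the nonlinear function
assert not np.allclose(f_sq(x + y), f_sq(x) + f_sq(y))
```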
6.1. What Is a Neural Network?¶
The "deep" in deep learning means that the neural network is composed of many layers. So what is a neural network? In general terms, a neural network can be thought of as two components: (1) a part that applies a nonlinear transformation \(g(\cdot)\) to the input features \(\mathbf{X}\) and outputs new features \(\mathbf{H} = g(\mathbf{X})\), and (2) a linear model like the one we already saw in Introduction to Machine Learning. The equation for our deep-learning regression model is:

\[\hat{y} = \vec{w} \cdot g(\vec{x}) + b\]
The machine learning chapters mainly discussed how subtle and difficult feature selection is. Here, we replace the hand-designed features used so far with a set of learnable features \(g(\vec{x})\), and keep the same linear model as before. You may wonder how \(g(\vec{x})\) should be designed; that is exactly the deep-learning part. \(g(\vec{x})\) is a differentiable function composed of layers, which are themselves differentiable functions with learnable weights (free variables). Deep learning is a mature field, and standard layers have been established for different purposes. For example, convolution layers are used to look at a fixed-width neighborhood around each element of an input tensor, and dropout layers are used to randomly inactivate input nodes as a form of regularization. The most basic and most commonly used layer is the fully-connected layer, also called a dense layer.
A fully-connected layer is defined by two things: the desired shape of the output features and the activation function. The equation for a fully-connected layer is:

\[\vec{h} = \sigma\left(\mathbf{W}\vec{x} + \vec{b}\right)\]
Here, \(\mathbf{W}\) is a trainable \(D \times F\) matrix, where \(D\) is the dimension of the input vector (\(\vec{x}\)) and \(F\) is the dimension of the output vector (\(\vec{h}\)); \(\vec{b}\) is a trainable \(F\)-dimensional vector; and \(\sigma(\cdot)\) is the activation function. The number of output features \(F\) and the activation \(\sigma(\cdot)\) are hyperparameters of the model: they are not values learned automatically during training, but values that should be tuned for each problem. Essentially any function can be used as an activation, as long as it is differentiable and defined over \((-\infty, \infty)\). However, if the activation is a linear function, stacking multiple fully-connected layers is equivalent to a single linear transformation, and the whole model collapses into linear regression. Therefore, so that the neural network can express nonlinear functions, a nonlinear activation is normally used. Besides nonlinearity, the activation should also be able to switch "on" and "off", that is, output zero for some region of input values. In general, activations output zero, or a value close to it, for negative inputs.
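This equation can be sketched directly in numpy (the function and variable names here are illustrative; with \(\mathbf{W}\) stored as a \(D \times F\) matrix as in the text, the product is written \(\vec{x}\,\mathbf{W}\)):

```python
import numpy as np

def dense(x, W, b, sigma=np.tanh):
    """One fully-connected layer: h = sigma(x W + b).

    W is stored as D x F (input dim x output dim) to match the text,
    so the matrix product is written x @ W.
    """
    return sigma(x @ W + b)

D, F = 4, 3  # input and output feature dimensions
rng = np.random.default_rng(0)
W = rng.normal(size=(D, F))  # trainable D x F weight matrix
b = rng.normal(size=F)       # trainable F-dimensional bias vector
x = rng.normal(size=D)       # one example with D input features

h = dense(x, W, b)
assert h.shape == (F,)  # F new features, each in (-1, 1) because of tanh
```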
The simplest activation function that has both of these properties is the rectified linear unit (ReLU):

\[\sigma(x) = \left\{\begin{array}{lr} x & x > 0\\ 0 & \textrm{otherwise} \end{array}\right.\]
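A minimal numpy sketch of ReLU, checking both properties (zero output for negative inputs, and nonlinearity):

```python
import numpy as np

def relu(x):
    # rectified linear unit: passes positive values, zeroes out the rest
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
assert np.array_equal(relu(x), [0.0, 0.0, 0.0, 1.5, 3.0])

# ReLU violates additivity, so it is nonlinear:
a, b = np.array([-1.0, 2.0]), np.array([3.0, -4.0])
assert not np.allclose(relu(a + b), relu(a) + relu(b))
```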
6.1.1. Universal Approximation Theorem¶
One reason neural networks are well suited to approximating unknown functions \(f(\vec{x})\) is that, given sufficient depth (number of layers) or width (size of the hidden layers), a neural network can approximate any function (the universal approximation theorem). This theorem has many variations, with proofs for infinitely wide and for infinitely deep neural networks. For example, it is known that any 1-dimensional function can be approximated by a neural network of depth 5 with infinitely wide layers (an infinite hidden-layer dimension) and ReLU activation functions [LPW+17].
6.1.2. Frameworks¶
Implementing deep learning is full of pitfalls, and it is easy to introduce bugs. Implementing from scratch, without mistakes, all of the features a neural network needs for training is difficult. Numerical instability in particular is a troublesome problem, often noticed only once a model fails to train. For this reason, some examples in this book use a more abstract software framework instead of JAX. Here we use Keras, one of the popular deep learning frameworks. Keras lets you express high-level operations in very simple code, which makes it well suited to presenting concise deep learning examples.
6.1.3. Discussion¶
In this book, we keep the introduction to deep learning itself as brief as possible. There are excellent learning materials on deep learning out there. Use the readings above, together with the Keras (or PyTorch) tutorials, to get familiar with neural networks and the concepts of training.
6.2. Revisiting the Solubility Model¶
As our first example of deep learning, let's train on the solubility dataset again, this time using a two-layer fully-connected neural network.
6.3. Running This Notebook¶
Click the launch icon at the top of this page to open this notebook in Google Colab. See below for how to install the required packages.
Tip
To install the required packages, create a new cell and run the following code:
!pip install dmol-book
If the install fails, it may be due to a package version mismatch. A list of the latest tested versions is available here.
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import dmol
6.3.1. Loading the Data¶
We download the data and load it as a Pandas data frame. Then we standardize the features the same way as before.
# soldata = pd.read_csv('https://dataverse.harvard.edu/api/access/datafile/3407241?format=original&gbrecs=true')
# had to rehost because dataverse isn't reliable
soldata = pd.read_csv(
"https://github.com/whitead/dmol-book/raw/master/data/curated-solubility-dataset.csv"
)
features_start_at = list(soldata.columns).index("MolWt")
feature_names = soldata.columns[features_start_at:]
# standardize the features
soldata[feature_names] -= soldata[feature_names].mean()
soldata[feature_names] /= soldata[feature_names].std()
6.4. Preparing the Data for Keras¶
Using a deep learning library makes many common tasks easy, such as splitting data and building layers. The code below builds a dataset for Keras from the numpy arrays.
full_data = tf.data.Dataset.from_tensor_slices(
(soldata[feature_names].values, soldata["Solubility"].values)
)
N = len(soldata)
test_N = int(0.1 * N)
test_data = full_data.take(test_N).batch(16)
train_data = full_data.skip(test_N).batch(16)
The `skip` and `take` calls in this code (see `tf.data.Dataset`) split the dataset into two parts and create batches from each.
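The split-and-batch behavior can be sketched with plain Python lists (these helper functions only mimic the real `tf.data.Dataset` methods):

```python
def take(data, n):
    # first n examples, like Dataset.take(n)
    return data[:n]

def skip(data, n):
    # everything after the first n, like Dataset.skip(n)
    return data[n:]

def batch(data, size):
    # group consecutive examples into batches, like Dataset.batch(size)
    return [data[i : i + size] for i in range(0, len(data), size)]

data = list(range(100))
test_batches = batch(take(data, 10), 16)   # 10 test examples
train_batches = batch(skip(data, 10), 16)  # remaining 90 for training

assert len(test_batches) == 1   # 10 examples fit in one (partial) batch
assert len(train_batches) == 6  # 90 examples -> 6 batches of up to 16
```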
6.5. Neural Network¶
Now let's build our neural network model. In this case, \(g(\vec{x}) = \sigma\left(\mathbf{W^0}\vec{x} + \vec{b}\right)\). We will call this function \(g(\vec{x})\) a hidden layer, because we do not treat the output of \(g(\vec{x})\) directly as the final result. Note that the predicted solubility is \(y = \vec{w}g(\vec{x}) + b\). We will use tanh as the activation function \(\sigma(\cdot)\) and set the output dimension of the hidden layer to 32. There are many nonlinear activation functions; tanh is chosen here for empirical reasons. As is typical, activation functions are generally chosen based on efficiency and empirical performance.
# our hidden layer
# We only need to define the output dimension - 32.
hidden_layer = tf.keras.layers.Dense(32, activation="tanh")
# Last layer - which we want to output one number
# the predicted solubility.
output_layer = tf.keras.layers.Dense(1)
# Now we put the layers into a sequential model
model = tf.keras.Sequential()
model.add(hidden_layer)
model.add(output_layer)
# our model is complete
# Try out our model on first few datapoints
model(soldata[feature_names].values[:3])
<tf.Tensor: shape=(3, 1), dtype=float32, numpy=
array([[-0.19586785],
[-0.41813114],
[-0.11751032]], dtype=float32)>
We now have a model that predicts solubility for the first three molecules above. You may have seen a warning that the Pandas data uses float64 (double precision) while our model uses float32 (single precision). This is not very important: the warning appears because the numerical precision is deliberately reduced for technical reasons. Since the variance in molecular solubility is far larger than the error introduced by 32-bit versus 64-bit floating point, we can ignore that error. You can silence the warning by modifying the last line as follows:
model(soldata[feature_names].values[:3].astype(np.float32))
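A small numpy check of why the float32 round-off is negligible here (the value is illustrative):

```python
import numpy as np

solubility = np.float64(-3.1415926535897932)  # a typical log-solubility value
as_32 = np.float32(solubility)

# casting 64-bit -> 32-bit keeps about 7 decimal digits of precision...
cast_error = abs(float(as_32) - float(solubility))
assert cast_error < 1e-6
# ...which is tiny compared to the spread of solubilities in the dataset
```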
So far, we have defined the architecture of our deep neural network and can call it on data. What remains is to train the model. By calling `model.compile` and defining an optimizer (usually a variant of stochastic gradient descent) and a loss function, the model is ready for training.
model.compile(optimizer="SGD", loss="mean_squared_error")
Notice how easy it is to define a deep learning model with Keras! Look back at the effort it took us before to set up the loss and the optimizer ourselves. That is one of the benefits of using a deep learning framework. We are now ready to train the model.
model.fit(train_data, epochs=50)
Training is easy too!
For reference, the earlier baseline model had a loss of about 3. Thanks to Keras's optimizations, training is also much faster. Now let's look at the model's performance on the test data.
# get model predictions on test data and get labels
# squeeze to remove extra dimensions
yhat = np.squeeze(model.predict(test_data))
test_y = soldata["Solubility"].values[:test_N]
plt.plot(test_y, yhat, ".")
plt.plot(test_y, test_y, "-")
plt.xlabel("Measured Solubility $y$")
plt.ylabel(r"Predicted Solubility $\hat{y}$")
plt.text(
min(test_y) + 1,
max(test_y) - 2,
f"correlation = {np.corrcoef(test_y, yhat)[0,1]:.3f}",
)
plt.text(
min(test_y) + 1,
max(test_y) - 3,
f"loss = {np.sqrt(np.mean((test_y - yhat)**2)):.3f}",
)
plt.show()
(Figure: predicted versus measured solubility on the test data, with a parity line; the correlation and loss computed above are annotated on the plot.)
This performance is better than our simple linear baseline model.
6.6. Exercises¶
Plot the ReLU function and verify that it is a nonlinear function.
Setting aside the bias-variance tradeoff for a moment, try increasing the number of layers in the neural network.
Show that a neural network reduces to linear regression if \(\sigma(\cdot)\) is the identity function.
What are the advantages and disadvantages of using deep learning instead of nonlinear regression for fitting data? And when would you choose nonlinear regression over deep learning?
6.7. Chapter Summary¶
Deep learning is a kind of machine learning that uses neural networks to perform classification and regression on data.
A neural network is a series of operations on matrices that contain tunable parameters.
A neural network transforms input features into a new set of features that can subsequently be used for regression or classification.
The most common layer is the fully-connected layer, in which every input element affects every output element. A fully-connected layer is defined by the desired shape of the output features and the activation function.
With a sufficient number of layers, or a sufficiently wide hidden layer, a neural network can approximate unknown functions.
Hidden layers are so called because we do not normally observe their output.
Libraries such as TensorFlow make it easy not only to split data into training and test sets, but also to build the layers of a neural network.
By building a neural network, we can predict various properties of molecules, such as solubility.
6.8. Cited References¶
- LPW+17
Zhou Lu, Hongming Pu, Feicheng Wang, Zhiqiang Hu, and Liwei Wang. The expressive power of neural networks: a view from the width. In Proceedings of the 31st International Conference on Neural Information Processing Systems, 6232–6240. 2017.