EihiS

December 10, 2016

Neural nets, deep learning, C basics - part 1

Filed under: linux, neuron — admin @ 7:15 pm

Networks with more than one hidden layer are harder to train than the others.
Articles on the subject can be found here and there, packed with maths, and they have what it takes to put people off.

There are plenty of published arXiv papers on it.

For a single-layer "perceptron"-type network (N inputs to a single output), let's start from the following base (C code).
Warning: I use 'home-made' functions that may differ from the 'canonical' forms, if such forms even exist.
I am more of a coder than a mathematician; the original programs are usually 'mathified' afterwards, not the other way round.

The backward computation of the network is, if you think about how it is computed, analogous to the rapid eye movements and unconscious movements of the sleep phases, during which the brain is supposed (or proven?) to 'reinforce' neuronal links by 'replaying' stimuli from the waking period. EEG waves show electrical signals travelling back and forth between the front and the back of the cortex, through the brain's neuronal networks.

The functions described below act on this simple structure, which describes the network (my code is always commented in English; the comments have been expanded a bit for this post) :

//
typedef struct {
	uint32_t synapses;// number of neuron inputs
	float* in;	// forward compute inputs
	float* back;	// backward computed inputs
	float* w;	// forward/backward weight for each input
	float* part;    // each input weight's share, normalized to 1.0
	float neuron;	// forward computed output, backward compute input
	float error;    // output error
	float wtot;			// total input weights
	//
	uint64_t epoch;	// incremented each time a learning function is applied (weights modified)
	uint64_t internal;// incremented each time the output is computed from the inputs
} reso;

A function that initializes this structure :

void RESO_init(reso* n,uint32_t input_count)
{
	uint32_t k;
	float init_w=1.0;// todo: use setup / a default var instead
	n->in=(float*) malloc(sizeof(float)*input_count);
	n->back=(float*) malloc(sizeof(float)*input_count);
	n->w=(float*) malloc(sizeof(float)*input_count);
	n->part=(float*) malloc(sizeof(float)*input_count);
	//
	n->synapses=input_count;
	n->wtot=init_w*input_count;	// prepare startup total weights
	for(k=0;k<n->synapses;k++)
	{
		n->in[k]=0.0;
		n->back[k]=0.0;	// backward inputs start cleared
		n->w[k]=init_w;	// default...
		n->part[k]=n->w[k]/n->wtot;// back ready
	}
	n->neuron=0.0;
	n->error=0.0;
	//
	n->epoch=1;
	n->internal=1;
}

Now, the 'FORWARD' computation function: it computes the output from the inputs.

void RESO_forward(reso* n)
{
	uint32_t i;
	n->neuron=0.0;
	n->wtot=0.0;
	for(i=0;i<n->synapses;i++)
	{
	  n->neuron+=n->in[i]*n->w[i]; // sum of (input * weight) for each input
	  n->wtot+=fabs(n->w[i]);      // accumulate the weights along the way, to get the total input weight
	}
	n->internal++;
}

Now for two functions that are essential to learning:

the 'BACKWARD' function, which performs a reverse computation (from the output back to the inputs) but does not modify any of the network's weights :

//
void RESO_back(reso* n,float expected)
{
	uint32_t k;
	n->error=expected - n->neuron;	// AKA 'expected - what_i_got'
	//
	n->neuron=n->error;// prepare for back compute
	//
	for(k=0;k<n->synapses;k++)
	{
		n->part[k]=fabs(n->w[k])/n->wtot;	 // weight part always positive
		n->back[k]=n->neuron*n->part[k]*n->in[k];// back[] is the backward equivalent of in[]'s for the net
	}
	n->epoch++;
}

.. I will come back to this one later, as well as to the next one: the function that modifies the network, which runs after the reverse computation (the previous function) :

void RESO_apply(reso* n,float rate)
{
	uint32_t k;
	for(k=0;k<n->synapses;k++)
	{
		n->w[k]+=n->back[k]*rate;
	}
}
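Putting RESO_back() and RESO_apply() together, the effective update applied to each weight is:

Δw[k] = rate * error * ( |w[k]| / wtot ) * in[k]

In other words, the output error is shared between the inputs in proportion to their current weight, scaled by the input value itself and dosed by the learning rate.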

In practice, all of this can be tested with a simple program :

#include "../common/classics.h" // check for a post on the website about it
// ..the usual required libraries
#include "../common/neuron/RESO_library.c"
// this is the name I use
// for the file that holds the previous functions and the structure definition
int main(int argc, char **argv)
{
	// the network
	reso snet;
	reso* net=&snet;			// pointer to the structure
	//
	float expected;				// used to store the expected response
	float RMSerror,RMScompute;
	float ang;					// an angle, for sine/cosine, see the while() loop

	//
	uint32_t i;		// utility variables..
	//
	//
	RESO_init(net,3);// initialize the network, with 3 inputs
	// all weights at 1.0 for testing
	net->w[0]=1.0;
	net->w[1]=1.0;
	net->w[2]=1.0;
	//
	RMSerror=0.0;
	RMScompute=1.0;
	// defined here for the tests: the 3 inputs, and the expected output given by a formula
	#define _INA	3.0
	#define _INB	0.0+sin(ang)
	#define _INC	0.0+cos(ang)
	// cook up a formula that uses the 3 inputs, to test the network
	#define _REPON	1.0+2.0*cos(ang)+3.0*sin(ang)
	//
	ang=0.0;
	while((RMScompute>0.01)||(net->epoch<2))// until the total RMS error <= 0.01
	{
		//
		net->in[0]=_INA;
		net->in[1]=_INB;
		net->in[2]=_INC;
		//
		expected=_REPON;
		ang+=2.6;// increment for the next while() cycle, 2.6 is an arbitrary choice..
		// *** COMPUTE : FORWARD
		RESO_forward(net);
		// *** COMPUTE : BACKWARD ( to get the error of the output n->neuron )
		//  expected holds the expected value, passed to the function
		RESO_back(net,expected);
		// along the way, accumulate an RMS-like error to monitor the network's learning
		RMSerror+=fabs(net->error);
		RMScompute=(RMSerror/(float)net->epoch);
		// learning epoch, instantaneous output error, and total RMS error since the start
		printf("\nepoch[%6.6lu] e(%+6.6f) , RMS{%5.5f} ",(unsigned long)net->epoch,net->error,RMScompute);
		printf("wtot[%+3.3f] --",net->wtot);	// also print the total input weight for this cycle
		// the 'n' inputs (3 here), and their share in % of the result the output gives
		for(i=0;i<net->synapses;i++) { printf("(%+3.3f%c)",net->part[i]*100.0,'%'); }
		// final step: modify the network, then loop back to the while()
		// the 'learning_rate' (learning speed) is passed to the function,
		// i.e. by how much it will come and modify the weights.
		// the learning rate acts as a dose applied to the modification.
		RESO_apply(net,0.05); // *** APPLY the modification to the weights ***

	}
	// once the RMS threshold is reached, print the final info : weights / share in the result
	printf("\nFinal Parts :\n");
	for(i=0;i<net->synapses;i++) { printf("(%+3.3f%c)",net->part[i]*100.0,'%'); }
	printf("\nFinal Weights:\n");
	for(i=0;i<net->synapses;i++) { printf("(%+3.3f )",net->w[i]);}
	printf("\n");
	// run a FORWARD-only compute cycle, 20 times, to check that
	// everything is OK :
	// 'test run'
	net->internal=0;
	while(net->internal<20)
	{
		net->in[0]=_INA;
		net->in[1]=_INB;
		net->in[2]=_INC;
		expected=_REPON;// only used to display the expected value VS the network output
		ang+=2.6;//
		RESO_forward(net);
		printf("\nCycle[%6.6lu] EXPECTED(%+6.6f) , ACTUAL{%5.5f} ",net->internal,expected,net->neuron);

	}
// TODO : malloc cleanups !
 return 0;// :)
}

The network, in its current state, converges, and that is the goal.

We can see that the share of each input is exactly proportioned as needed, namely :

Final Parts :
(+6.250%)(+56.250%)(+37.500%)
Final Weights:
(+0.333 )(+3.000 )(+2.000 )
A) We have in[0]=3.0 , weight=0.33333 , which gives 3.0*0.33333333 = 1.0
B) We have in[1]=sin(angle) , weight = 3.0 , which gives 3.0*sin(angle)
C) We have in[2]=cos(angle) , weight = 2.0 , which gives 2.0*cos(angle)
The neuron's value being the sum of the inputs * their respective weights,
we get neuron = A+B+C = 1.0 + 3.0*sin(angle) + 2.0*cos(angle)
.. which is exactly what we wanted, how convenient… (expected : 1.0+2.0*cos(ang)+3.0*sin(ang) )
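To double-check this numerically, one can force those final weights and run a single FORWARD pass; a throwaway sketch (my addition, not part of the original program), to be appended to the test program above :

	// verification sketch (not from the original post) : force the final weights,
	// run one FORWARD pass, and compare with the target formula
	net->w[0]=1.0/3.0;
	net->w[1]=3.0;
	net->w[2]=2.0;
	ang=0.77;	// any angle will do
	net->in[0]=3.0;
	net->in[1]=sin(ang);
	net->in[2]=cos(ang);
	RESO_forward(net);
	printf("\nforward(%+6.6f) expected(%+6.6f)\n",net->neuron,1.0+2.0*cos(ang)+3.0*sin(ang));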

To be continued in the next installment. (If you REALLY need help, or have constructive remarks, contact : admin(AT)eihis.com)


September 27, 2015

Neuron networks , part III

Filed under: Uncategorized, linux — admin @ 8:46 am

The NEU.c header looks like:

// activate extra computation : NE_outPulse[] mode
#define _NEU_USES_PULSES_OUTPUTS
// activate extra computation : NE_outTrigger[], Schmitt-trigger tanh'd output
#define _NEU_USES_ASYMETRIC_OUTPUTS
//
#define _neu_asym_magnify 200.0//2.0 // 8.0  is default , Schmitt-trigger amplifier for tanh
#define _neu_asym_value 0.1 //  default 0.1 , Schmitt-trigger 'delta' from 0 , rising/falling. 0.1 : Schmitt-trigger asymmetry
#define _neu_asym_rescale (float) (( (float) _neu_asym_magnify+ (float) _neu_asym_magnify + (float) _neu_asym_value) / (float) (_neu_asym_magnify) )
//
// Hidden and output neurons (post) modes
//
#define _NEU_USES_TANH_HIDDEN
#define _NEU_USES_TANH_OUTPUT
// learning rates
#define _NEU_INITIAL_LR_HO (float)0.08
#define _NEU_INITIAL_LR_IH (float)0.008
// we use some globals :
int NE_numInputs;   //  how many inputs
int NE_numPatterns; // how many patterns for learning, if a set is created
int NE_numHidden;   // hidden layer neuron number
int NE_numOutputs;  // output layer neuron quantity
int NE_numEpochs;   // used to count learning epochs (convergence)
//
// NEU variables
int NE_patNum = 0;
double NE_errThisPat[_max_outputs];
double NE_outPred[_max_outputs];
// asymmetric outputs
#ifdef _NEU_USES_ASYMETRIC_OUTPUTS
double NE_last_outPred[_max_outputs];	// for last state saves
double NE_outTrigger[_max_outputs];
#endif
// pulsed mode output (another , alternate , output mode)
#ifdef _NEU_USES_PULSES_OUTPUTS
double  NE_outPredPulsed[_max_outputs];
double NE_outPulse[_max_outputs];
double NE_outPeriod[_max_outputs];
#endif
//
double NE_RMSerror[_max_outputs];
double NE_bias_value = 0.0 ;	// variable bias - used for 1 of the inputs
//
double hiddenVal[_max_hidden];	// sized to the maximum hidden neuron count
double hiddenPulse[_max_hidden];
double hiddenPeriod[_max_hidden];
//
double weightsIH[_max_inputs][_max_hidden];
double weightsHO[_max_hidden][_max_outputs];
//
double trainInputs[_max_patterns+1][_max_inputs];
double trainOutput[_max_patterns][_max_outputs];
//
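The _max_* capacity macros used by the arrays above are not part of this excerpt; they only size the static storage. A plausible set of definitions (the values here are my assumption, not the original ones) could be:

// assumed capacity macros - placeholder values, not taken from the original NEU.c
#define _max_inputs   16
#define _max_hidden   16
#define _max_outputs   4
#define _max_patterns 64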

And the CalcNet() function ( heavy optimizations could be applied ). By the way, sorry for the bad indentation (the blog's content editor is not good at this).

// calculates the network output : set up NE_patNum first. trainOutput has to be
// filled only in learning mode, to compute the errors.
void NEU_calcNet(void)
{
    //
    int j,k;
    int i = 0;
    for(i = 0;i<NE_numHidden;i++)
    {
        hiddenVal[i] = 0.0;
        //
        for(j = 0;j<NE_numInputs;j++)
        {
            if(trainInputs[NE_patNum][j]>1.0) trainInputs[NE_patNum][j]=1.0;
            if(trainInputs[NE_patNum][j]<-1.0) trainInputs[NE_patNum][j]=-1.0;
            hiddenVal[i] = hiddenVal[i] + (trainInputs[NE_patNum][j] * weightsIH[j][i]);
        }
        // uses tanh'd mode ?
#ifdef _NEU_USES_TANH_HIDDEN
        hiddenVal[i] = tanh(hiddenVal[i]); // hidden state is tanh'd
#endif
        // uses pulsed mode?
#ifdef _NEU_USES_PULSES_OUTPUTS
        if((hiddenPulse[i]>-0.001)&&(hiddenPulse[i]<0.001))
        {
            hiddenPeriod[i]=fabs(hiddenVal[i])*0.7;// assign new period (fabs: these are doubles)
            if (hiddenVal[i]>0.0) {hiddenPulse[i]=1.0;}else {hiddenPulse[i]=-1.0;} // set corresponding output pulse pos/neg
        }
        else { hiddenPulse[i]*=0.7-hiddenPeriod[i];} // loses amplitude down to < abs 0.001
#endif
    }
    //calculate the output of the network
    //the output neuron is linear
    for (k=0;k<NE_numOutputs;k++)
    {
        NE_outPred[k] = 0.0;
#ifdef _NEU_USES_PULSES_OUTPUTS
        NE_outPredPulsed[k]=0.0;// pulsed mode intermediate output
#endif
        for(i = 0;i<NE_numHidden;i++)
        {
            NE_outPred[k] = NE_outPred[k] + hiddenVal[i] * weightsHO[i][k];
#ifdef _NEU_USES_PULSES_OUTPUTS
            NE_outPredPulsed[k] = NE_outPredPulsed[k] + hiddenPulse[i] * weightsHO[i][k];
#endif
        }
        //calculate the error
        NE_errThisPat[k] = NE_outPred[k] - trainOutput[NE_patNum][k];
        // tanh'd
#ifdef _NEU_USES_TANH_OUTPUT
        // uses asymmetric (trigger) outputs ?
#ifdef  _NEU_USES_ASYMETRIC_OUTPUTS
        if(NE_outPred[k]-NE_last_outPred[k]>0.0) // rising edge
        {
            NE_last_outPred[k]=NE_outPred[k];
            NE_outTrigger[k]=(NE_outPred[k]-_neu_asym_value)*_neu_asym_magnify;
            //NE_outPred[k]-=0.1;
            //NE_outPred[k]*=_neu_asym_magnify;
        }
        else
        {
            NE_last_outPred[k]=NE_outPred[k];	// falling edge
            NE_outTrigger[k]=(NE_outPred[k]+_neu_asym_value)*_neu_asym_magnify;
        }
        // tanh'd final result
        NE_outTrigger[k]=tanh(NE_outTrigger[k])/_neu_asym_rescale;
#endif
        // uses tanh'd output, with no asymmetry
        NE_outPred[k]=tanh(NE_outPred[k]);
#endif
        //
#ifdef _NEU_USES_PULSES_OUTPUTS
        // pulse mode : uses NE_outPredPulsed
        if((NE_outPulse[k]>-0.001)&&(NE_outPulse[k]<0.001))
        {
            NE_outPeriod[k]=fabs(NE_outPredPulsed[k])*0.7;// assign new period (fabs: these are doubles)
            if (NE_outPredPulsed[k]>0.0) {NE_outPulse[k]=1.0;}else {NE_outPulse[k]=-1.0;} // set corresponding output pulse pos/neg
        }
        else { NE_outPulse[k]*=0.7-NE_outPeriod[k];} // loses amplitude down to < abs 0.001
#endif
    }
}
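To make the calling convention concrete, here is a minimal, hypothetical driver for the 10-3-1 network shown below (cell_state and neighbour[] are made-up names, and the weights are assumed to be already loaded into weightsIH[][] and weightsHO[][]; this snippet is not part of the original post):

// hypothetical usage sketch : evaluate one cell with the 1+1+8 inputs network
int j;
NE_numInputs  = 10;
NE_numHidden  = 3;
NE_numOutputs = 1;
NE_patNum     = 0;
trainInputs[0][0] = cell_state;            // input 0 : the cell's current state
trainInputs[0][1] = 1.0;                   // input 1 : the bias input
for(j = 0; j < 8; j++)
    trainInputs[0][2 + j] = neighbour[j];  // inputs 2..9 : the 8 surrounding cells
trainOutput[0][0] = 0.0;                   // only used by NEU_calcNet() to fill NE_errThisPat
NEU_calcNet();
// NE_outPred[0] now holds the new cell value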

Here is an example of float weight settings ( 1+1+8 inputs , 1 output )

//

Neuron network array
numInputs :10
numHidden :3
numOutputs :1
INPUT to HIDDEN weights:
INPUT[0]:0.7249 -0.3254 -0.0132 
INPUT[1]:2.6476 -1.7487 1.8793 
INPUT[2]:1.6000 -2.1098 -0.1300 
INPUT[3]:1.5954 -2.1106 -0.1713 
INPUT[4]:1.5973 -2.1053 -0.0823 
INPUT[5]:1.5956 -2.1127 -0.2124 
INPUT[6]:1.5922 -2.1073 -0.1447 
INPUT[7]:1.5945 -2.1101 -0.1476 
INPUT[8]:1.5912 -2.1103 -0.1890 
INPUT[9]:1.5921 -2.1062 -0.0817 
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.8294 1.8561 -1.0271
// input 0 is the cell's state at this moment.
// input 1 is the bias input. the network works 'out of the box' with 1.0 as a start value
// inputs 2 to 9 are the values of the 8 surrounding cells
// output 0 is, obviously, the new cell's value once the CalcNet() function has been executed.
//
// nb : these weights have been obtained using a kind of unsupervised network training
// the following are other working results, in bulk..
Neuron network array
numInputs :10
numHidden :3
numOutputs :1

INPUT to HIDDEN weights:
INPUT[0]:-0.0229 0.2218 0.7648
INPUT[1]:-2.0418 1.8457 3.1429
INPUT[2]:0.2120 2.2448 1.8066
INPUT[3]:0.2278 2.2442 1.8049
INPUT[4]:0.1597 2.2419 1.8062
INPUT[5]:0.2398 2.2457 1.8072
INPUT[6]:0.1784 2.2429 1.8051
INPUT[7]:0.1867 2.2433 1.8063
INPUT[8]:0.2036 2.2434 1.8054
INPUT[9]:0.1458 2.2410 1.8064
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.0483 -1.7011 1.6570 

Yet another one
Neuron network array
numInputs :10
numHidden :3
numOutputs :1

INPUT to HIDDEN weights:
INPUT[0]:-0.3511 -0.7469 -0.3062
INPUT[1]:-1.9654 -2.7043 -1.9516
INPUT[2]:0.0046 -1.6099 -2.3226
INPUT[3]:0.0888 -1.6007 -2.3241
INPUT[4]:0.0356 -1.6225 -2.3280
INPUT[5]:0.0363 -1.6139 -2.3275
INPUT[6]:-0.0243 -1.6261 -2.3254
INPUT[7]:-0.0636 -1.6227 -2.3232
INPUT[8]:-0.0350 -1.6153 -2.3215
INPUT[9]:-0.0603 -1.6307 -2.3269
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.0404 -1.8403 1.8139
Finally, another one (for that particular one, the EXACT Conway's rules are output with a bias value of 2.3 )
Neuron network array
numInputs :10
numHidden :3
numOutputs :1

INPUT to HIDDEN weights:
INPUT[0]:0.9719 0.4218 -0.4701
INPUT[1]:3.6418 2.0367 -2.4421
INPUT[2]:1.8977 3.3946 -0.1189
INPUT[3]:1.8869 3.4056 0.0242
INPUT[4]:1.8758 3.4251 0.1371
INPUT[5]:1.9026 3.3867 -0.1617
INPUT[6]:1.8715 3.4222 0.2166
INPUT[7]:1.9007 3.3728 -0.1277
INPUT[8]:1.8897 3.4263 0.0333
INPUT[9]:1.8785 3.3993 0.1416
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.1289 -1.1504 1.0407

August 22, 2015

neuron networks, part 2

Filed under: Uncategorized — admin @ 2:53 pm

Following the previous article, the network was trained with free-running feedback.
In addition, a second output neuron was created, whose output, instead of following Conway's Game of Life rules, was trained to be the sine of the expected normal, 'Game of Life ruled' output.

The trained network's output for output 0 is almost the same (8 hidden neurons were used instead of 5).

The screen capture of outputs 0 and 1 :

Left: output zero (the normal, Game of Life ruled output), and right: the 'sine'-like output 1 for the same 8 input cells :

This networks’ complete weights dump :

Neuron network array
numInputs :2
numHidden :8
numOutputs :2
INPUT to HIDDEN weights:
INPUT[0]:18.4715 -15.7549 -21.2166 -19.4792 2.0692 -2.9851 -14.6416 -17.5079 
INPUT[1]:-8.0632 -13.0431 4.0268 -12.4184 -7.6292 -8.4492 12.2782 -7.1637 
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.8568 1.2939 1.7122 -1.2514 0.8039 -0.7578 1.2156 -0.5770 
OUTPUT[1]:0.5297 -0.1719 0.1888 -0.7751 0.1120 -0.1462 0.2815 0.4162

This network uses tanh'd outputs on both the hidden and output layers ( tanOutput[n]=tanh(50.0*NormalOut[n]) ).

Output 1 shows groups of cells and highlights some interesting shapes that the normal output[0] does not reveal :


Now, the same network is trained the same way, but output[1] with no tanh() function applied is graphed. This renders the subtle values of this output in the -1.0 / 1.0 range. (The supervised target for output[1] was : output[1] = ( actual_output[0] + the new 8-cell sum value ) divided by 2.0.)


The network is then modified : we add 1 new input, namely the X coordinate of the 2D plane being rendered. The actual 2D area is a 64×64 cell array, so the 0-64 value for X will be mapped to a -1.0 / +1.0 range for this new input (see the mapping sketch just below).
This time, output[1] is again trained in an unsupervised manner : we want output[1] to be a copy of the actual X value.
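A one-line sketch of that mapping (the exact constants are my guess, not taken from the post):

double x_input = -1.0 + 2.0 * (double)x / 63.0; // maps x = 0..63 to the -1.0 .. +1.0 range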

So, the new network has 3 inputs :

  1. input[0] is the sum of the 8 surrounding cells at T
  2. input[1] is the actual value of the output[0] at T-1 ( namely, the state of the cell at T )
  3. input[2] is the X coordinate of the cell being processed ( 0-64 range mapped to -1.0 / +1.0 )
The outputs are expected to be :
  1. output[0] is the result of applying the rules of the Game of Life.
  2. output[1] is expected to be the ‘image’ of the X coordinates, with nothing more.
Remarks : the network is trained with 10 hidden neurons. Convergence takes longer, because the input[2] value is seen as an unwanted value for the output[0] problem : the weights from input[2] have to be lowered as much as possible for output[0] to converge to the correct value, while that same input[2] value is absolutely needed for output[1] to be correct. This makes the overall convergence time longer.
Here is the snapshot after 2,300,000 learning epochs :
The output[1] values are almost correct; some errors can be seen at the fringes. Remember that our hidden and output neurons are tanh'd (the output can't be linear).
LR_IH and LR_HO were 0.05 for this training.
Here is the dump of the network's weights, for information :
Neuron network array
numInputs :3
numHidden :10
numOutputs :2
INPUT to HIDDEN weights:
INPUT[0]:-17.4559 0.0378 0.0916 -1.1608 -2.2167 -15.6072 -14.7210 -16.3537 -1.0468 -25.0423
INPUT[1]:-15.0747 -1.8141 3.3090 -7.1765 1.9960 -8.8084 6.3518 14.2116 6.8359 4.5404
INPUT[2]:0.3854 2.4042 2.6311 -0.1052 13.9876 0.0769 0.1231 0.1707 0.6056 -0.0481
HIDDEN to OUTPUT weights:
OUTPUT[0]:1.4326 -0.0190 0.1235 -0.9902 -0.0714 -1.4984 -1.5762 1.4177 -1.1382 1.7470
OUTPUT[1]:0.0721 1.9509 1.7133 -0.4680 0.7523 -0.0126 -0.0061 0.0173 -0.8408 0.0440

A closer look at the weights shows that input[2] is 'mainly' linked to hidden neuron 4 (weight about 13), and this hidden neuron 4 is then linked to output[1] with a weight of 0.7523.

For information, the following output is the one for a network trained with :

  1. output[0] following the game of life rules
  2. output[1] outputs the sum of the 8 surrounding cells' values, multiplied by the X coordinate of the cell within the area :



August 12, 2015

neuron networks, part I

Filed under: Uncategorized — admin @ 11:15 am

A neuron network is trained using back-propagation learning, to achieve a successful copy of the Game of Life rules (Conway's rules).

The neuron network description is :

  • 2 input neurons :
  1. Input 1 is the sum of the 8 surrounding cells at T ( range is 0.0 to 1.0 == 0 to 8 cells )
  2. Input 2 is the output neuron state at T-1 ( namely, the previous state of the output at T) (-1.0/+1.0)
  • 1 output neuron : the new cell's state for this epoch, -1.0 / +1.0 range
  • 5 hidden neurons
Network specifications :
  • input/output ranges from -1.0 to 1.0
  • hidden layer neurons use a sigmoid-like function for their states : { hiddenval[k]=tanh(hiddenval[k]);  }
  • output layer neurons use a sigmoid-like function with a magnifier for outputs :   { out[k]=tanh(out[k]*20.0); }
Many convergences with various weights happen. Using fewer than 5 hidden neurons cannot achieve convergence.
Using more than 5 hidden neurons does not improve convergence at all.
Here is the weight dump of an achieved convergence, with an overall error of less than 0.0036 % :
It was achieved after 72 million epochs, with a hidden-to-output learning rate of 0.03 and an input-to-hidden learning rate of 0.3.
Weights,INPUT 1 to Hidden -22.13 -29.4693 14.7538 21.5601 -21.9645
Weights,INPUT 2 to Hidden -17.1351 -19.9037 -6.5521 -3.7809 17.136
Weights,Hidden to OUTPUT 1 1.8878 -1.906 1.8706 -1.3286 1.5681

Needless to say, the problem of the Game of Life rules has been posed as : "the output depends on the sum of the 8 surrounding cells AND the state of the center cell @ time T".

A more complex approach is to use 8 input neurons ( the 8 surrounding cells) , plus the input neuron for the T-1 state of the cell.
(See part III for details about this)
Using the surrounding cells' sum as an input value is only one approach: it is a 'simplification' provided to the neuron network previously described.

The Game of Life rule has two stages (a plain-C sketch follows the list below) :

  1. sum the surrounding cells' values that are 'on'
  2. from this sum value, set the output value :
  • range 0..1 is the first solution case
  • range 2 is the second solution case
  • range 3 is the third solution case
  • range 4 to 8 is the fourth solution case
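For reference, here is a plain-C sketch of that rule (my own formulation, using the -1.0/+1.0 cell encoding the network outputs; it is not code from the original post):

// plain-C reference of the rule the network is trained to copy
float life_rule(int neighbour_sum, float current_state)
{
	if (neighbour_sum <= 1) return -1.0f;          // 0..1 neighbours : dies (first case)
	if (neighbour_sum == 2) return current_state;  // 2 neighbours    : keeps its state (second case)
	if (neighbour_sum == 3) return 1.0f;           // 3 neighbours    : becomes/stays alive (third case)
	return -1.0f;                                  // 4..8 neighbours : dies (fourth case)
}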
Fewer than 5 hidden neurons will not converge.
We can compute a simple overview of the actual neuron network : 2×5 + 5×1 = 15 'synapses' computed.
For the network of Part III of this post series, we have 9×3 + 3×1 = 30 synapses. So it is obvious that summing the surrounding cells' values and feeding that sum into the network reduces the CPU time needed to compute an output state, but the setup of part III is somehow the 'real' cellular automaton model..
The following images illustrate the results of the previous network (pre-computed sum of the surrounding cells )
This network just has different weight results (see last picture for values)
( snapshot of self running , once convergence was ok )
Here is the 64×64 pixels output MAPPING for the output neuron .
X axis (horizontal) is INPUT 1 :
The sum of the surrounding cells,
remapped from (0.0->8.0) to the ( 0.0 ; +1.0 ) range, is used as the input value.
(left is 0.0, right is +1.0 , i.e. all 8 surrounding cells are on when it's +1.0)

Y axis (vertical) is INPUT 2 :
the network's output at T-1 ( last output state ). Top is -1.0 , bottom is +1.0.
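The mapping picture itself can be reproduced by sweeping the two inputs across the image, one pixel per input pair; a hypothetical sketch (net_forward() stands for whatever forward-pass routine is used, it is not a function from this post):

// hypothetical sweep producing the 64x64 response map
int x,y;
for(y = 0; y < 64; y++) {
	for(x = 0; x < 64; x++) {
		double input1 = (double)x / 63.0;               // INPUT 1 : sum of surrounding cells, 0.0 .. +1.0
		double input2 = -1.0 + 2.0 * (double)y / 63.0;  // INPUT 2 : previous output, -1.0 .. +1.0
		double out = net_forward(input1, input2);       // placeholder for the network's forward pass
		// plot 'out' at pixel (x,y)
	}
}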

Finally, here is a snapshot of the weights used for these results :
(Top of image : Hidden to Output neuron weights)
(Bottom : input 1, input 2 to Hidden neurons weights)
This is a test network description. (to be continued)
