Day/date   Hrs  Description

S 92.11.21 3.0  Talking with CLL
T 92.11.24 1.0  Reading Sci. Amer. article
F 92.11.27 1.0  Talking with CLL
N 92.11.29 4.0  Coding, talking with CLL
M 92.11.30 0.5  Coding: makefile, learn d2c
T 92.12.01 0.5  Makefile, convert to .d and .dh
W 92.12.02 1.0  Fix d2c line numbers, coding main()
R 92.12.03 1.5  Convert to ANSI C, compile pass, remove syntax errors, link
N 92.12.06 2.0  Fix bug in matrix code, talk to CLL on phone, trivial test prog
T 92.12.08 1.5  More coding of test program, first test results
R 92.12.10 0.5  Examining initial test results, trying to check with calculator
N 92.12.13 2.0  Confirm results with calculator, first genetic mutation :-)
S 92.12.19 2.5  Read config file, redesign some structures
N 92.12.20 0.5  Talking to CLL
N 92.12.20 3.0  Recoding kludges, increasing low-level robustness
F 93.01.01 7.0  Finish recoding kludges, running assorted test simulations
N 93.01.03 3.0  Talking with CLL about direction to go from here
R 93.01.07 2.0  Add max generations, rand seed; clean up var names and comments
N 93.01.10 3.5  Creating test frog train file; talking to CLL; expand config
W 93.01.13 1.5  Load mutation matrix & default matrix, shuffle code around
S 93.01.16 1.5  Act on variable matrix, create fast index array
W 93.01.20 1.5  Cleaning up random number module, other code; talking to CLL
S 93.01.23 3.5  OrgZero instead of OrgRand; cleanup comments; begin population
W 93.01.27 3.0  Get primitive population working, demes not implemented yet
N 93.01.31 0.5  Allow command-line overrides for some configuration values
N 93.01.31 0.5  Test with various population sizes from 1 to 500; graph results
S 93.02.06 2.0  Clean up comments; fix avail list/org copy bug; add (REAL) casts
N 93.02.07 3.0  Start adding ploidy; fix longstanding normalize bug
S 93.02.27 1.0  Clean up comments
N 93.02.28 1.5  Work on ploidy
N 93.03.14 3.5  Cut in Dad's new enzyme test; remember di & dj in array alloc
N 93.03.21 3.0  Special test: train on sections of the logistic map x |-> 4x(1-x)
W 93.03.24 0.5  Looking over code listings because I have a weird hang bug
R 93.03.25 0.5  Looking over code listings
N 93.03.28 0.5  Special test: show derivative [F(x+h)-F(x)]/h (h=0.01)
S 93.04.03 6.0  Cleaning up code, talking to CLL
N 93.04.04 0.0  Found hang bug! (not charging time for this one)
M 93.04.05 1.0  Looking over TeX listings of program, marking corrections
T 93.04.06 3.0  Editing changes from last night; adding a few comments; cleanup
N 93.04.11 0.5  Musing over source code listings, marking minor changes
M 93.04.12 1.0  Merge header files/prototypes; cleanup
T 93.05.25 2.5  Helping CLL prepare logistic eqn demo for 5/26/93
N 93.05.30 3.0  Discussing new enzyme rate reaction and diffeq stuff with CLL
xxxxxxxxxxxxxxxxxxxxxxx THIS LOG FILE WAS ABANDONED AND IS CONTINUED
xxxxxxxxxxxxxxxxxxxxxxx IN THE \ENZYME DIRECTORY!
          _____

          83.0  (total hours)




1/27:  A population of size 10 worked its way down to an error of 0.081 in
       1/10 of the time it took a population of size 1.  A population of size
       500 didn't do as well.  A population of 20 was better than 1 but worse
       than 10.  I need to do more tests with other values.  So far, I know:

          Population   1000 iterations   10,000 iterations
          ----------   ---------------   -----------------
               1          0.112477           0.081472
              10          0.084318           0.058627
              20          0.262316           0.056881
              50          0.295875           0.057611
             100          0.345388           0.057158
             500          0.520770           0.118881

1/31:  Ran simulations for populations in the set { 1, 2, 3, ..., 100, 110,
       120, 130, ..., 500 }.  Graphing the results with a little BASIC
       program, it appears the result after 10,000 iterations is more or
       less random.  Increasing the iteration count to 100,000 or 1,000,000
       may give different results.

2/3:   Ran simulations for populations in the set { 1, 10, 20, 30, ..., 100 }
       with iterations up to 100,000.  The final results are more or less the
       same:

          Population   100,000 iterations
          ----------   ------------------
               1            0.060156
              10            0.056903
              20            0.056810
              30            0.057215
              40            0.056805
              50            0.056921
              60            0.056798
              70            0.056807
              80            0.056800
              90            0.057577
             100            0.056797

2/7:   Found a bug in my normalization routine!  It's been present for some
       time.  I was saying
         x[i][0] = Normalize(x[o->ni + i][0]);
       where I should've been saying
         x[o->ni + i][0] = Normalize(x[o->ni + i][0]);
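
       For the record, the fixed loop looks roughly like this (just a sketch;
       the non-input unit count nn is my placeholder name, not necessarily
       what the program calls it):

         /* Squash only the non-input units (offset o->ni and up), writing
          * each result back to the SAME element that was read. */
         for (i = 0; i < nn; i++)
           x[o->ni + i][0] = Normalize(x[o->ni + i][0]);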
 
       I've rerun the simulation from 2/3.  The results are quite different this
       time:
 
          Population   1000 iterations   10,000 iterations
          ----------   ---------------   -----------------
               1          0.063387           0.004108
              10          0.090438           0.000875
              20          0.176865           0.000444
              30          0.205280           0.001238
              40          0.278453           0.000314
              50          0.087402           0.000788
              60          0.019181           0.003555
              70          0.309792           0.002496
              80          0.197311           0.001805
              90          0.339899           0.018974
             100          0.218765           0.033226
 
 
3/14:  Added a first test of Dad's new physiological network idea -- network
       of enzymes rather than neurons.

       Running a modified FROG441 input file (population = 10) with delta-t
       values of 1, 1/10, and 1/100 generates the following error curves:

                                        
                          dt = 1.0       dt = 1/10       dt = 1/100
          Generation      iter = 2       iter = 20       iter = 200
          ----------     ----------     -----------     ------------
                 1        0.891721       0.871779        0.877966
                10        0.708211       0.753380        0.610064
               100        0.251138       0.437542        0.416405
              1000        0.182123       0.225745        0.069329
              2000        0.170083       0.142968        0.050542
              3000        0.139696       0.097929        0.048708
              4000        0.056180       0.078930        0.048190
              5000        0.038267       0.066958
              6000        0.033789       0.066482        (cancelled;
              7000        0.033730       0.066250         too slow)
              8000        0.033667       0.066245
              9000        0.033446       0.064928
             10000        0.033356       0.064640

       When we used the formula x |-> x + G(x) dt, it worked pretty well.
       Then Dad suggested the formula x |-> x + x G(x) dt and that just sat
       there not lending itself to optimization.  Later Dad said he goofed and
       the first formula was the right one.
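
       In code the working rule is just an explicit Euler step.  A minimal
       sketch (the names EulerStep, MAX_UNITS, n, and G() are placeholders
       for whatever the enzyme-network code really uses):

         /* One explicit Euler step of x |-> x + G(x) dt.  All of G(x) is
          * evaluated first so every unit updates from the OLD state. */
         void EulerStep(REAL x[], int n, REAL dt)
         {
           REAL g[MAX_UNITS];
           int  i;

           for (i = 0; i < n; i++)
             g[i] = G(x, i);      /* rate dx[i]/dt at the current state */
           for (i = 0; i < n; i++)
             x[i] += g[i] * dt;   /* x <- x + G(x) dt */
         }

       Halving dt while doubling the iteration count should leave the
       trajectory roughly unchanged, which is what the table above is
       probing.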


3/21:  I seem to have some weird bug with the new stuff I've added recently --
       some input files cause the program to hang when starting.  I'm going to
       make listings and study them to find the bug.  For the tests tonight I
       backed up to a previous version that I trust completely.

       We tested training on the logistic equation x |-> 4x(1-x).

       FIRST TEST:
       50 training values in the interval [1/4,3/4]
       100 test values in the range [-1,2]
       Graph looks something like this:


                                                 Not quite parabolic-looking...
                              |                  ...but pretty close in the
                           1 -+-   |  ****   |   training interval.
                              |    |**    *  |
                              |   **       * |
                              |  * |        ****************************    
                              | *  |         | ^
                              |*   |         | This region from 3/4 onward
                              *    |         | is totally flat.  Dad says maybe
                             *|    |         | because 3/4 is a fixed point.
                            * |    |         |
                           *  |    |         |    
       ---|---------------*---+----|---------|----|--------------------|---
         -1              *   0|   1/4       3/4   1                    2
                       **     |
                   ****       |     TRAINING
          *********           |     INTERVAL
                              |
                              |
                              |
                              |
                              |
                          -1 -+- 
                              |

       SECOND TEST:
       50 training values in the interval [1/8,7/8]
       100 test values in the range [-1,2]
       Graph looks something like this:

                                
                              | 
                           1 -+- |      *      |  Much better looking!
                              |  |    ** **    |
                              |  |  **     **  |  But it still doesn't cross
                              |  | *         * |  the origin.  Oh well.
                              |  |*           *|
                              |  *             *
                              | *|             |*
                              |* |             | *
                              *  |             |  *
                             *|  |             |   *
       ---|-----------------*-+--|-------------|--|-*------------------|---
         -1                * 0| 1/8           7/8 1  *                 2
                          *   |     TRAINING          **             ***
                         *    |     INTERVAL            ***      ****
                        *     |                            ******
                       *      |
                      *       |
                     *        |
                   **         |
                  *           |
                **        -1 -+- 
             ***              |
          ***
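
       (Note on the first test: x = 3/4 really is a fixed point of the map,
       since 4 * (3/4) * (1 - 3/4) = 3/4, so Dad's suggestion about the flat
       tail is at least consistent.)

       Generating these data sets is just sampling the map.  A sketch, with
       an invented output format -- the real training files look different:

         /* Emit 50 evenly spaced training pairs (x, 4x(1-x)) on [lo,hi]
          * plus 100 evenly spaced test inputs on [-1,2]. */
         #include <stdio.h>

         int main(void)
         {
           double lo = 0.125, hi = 0.875;   /* the second test's interval */
           int i;

           for (i = 0; i < 50; i++) {
             double x = lo + (hi - lo) * i / 49.0;
             printf("train %g %g\n", x, 4.0 * x * (1.0 - x));
           }
           for (i = 0; i < 100; i++) {
             double x = -1.0 + 3.0 * i / 99.0;
             printf("test %g\n", x);
           }
           return 0;
         }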



3/28:  We did a quick test involving the slope of the matrix elements -- that
       is, how much the error changes with a small change in one matrix
       element.  This "derivative" of a matrix element is important because it
       sheds light on genetics -- holistic versus Mendelian genes.  We ran the
       FROG441.IN file; the error after 25000 generations was 0.001097.  As an
       initial test we picked h = 0.01 as the delta.  Here are the matrices
       (I need to compute these automatically rather than by hand -- see the
       sketch after the histogram below):

                                M                                      B
       ----------------------------------------------------------------------

       Variable matrix:

        100000000  0
        010000000  0
        001000000  0
        000100000  0
        xxxx00000  x
        xxxx00000  x
        xxxx00000  x
        xxxx00000  x
        0000xxxx0  x

       Genes after 25000 generations:

        +1.00  +0.00  +0.00  +0.00 +0.00  +0.00  +0.00  +0.00  +0.00    +0.00
        +0.00  +1.00  +0.00  +0.00 +0.00  +0.00  +0.00  +0.00  +0.00    +0.00
        +0.00  +0.00  +1.00  +0.00 +0.00  +0.00  +0.00  +0.00  +0.00    +0.00
        +0.00  +0.00  +0.00  +1.00 +0.00  +0.00  +0.00  +0.00  +0.00    +0.00
       -12.81 -12.42  +1.54  -7.77 +0.00  +0.00  +0.00  +0.00  +0.00    +6.58
        +4.54  +0.44  -1.71 +54.30 +0.00  +0.00  +0.00  +0.00  +0.00    +0.72
        -0.70  +1.71  -1.31  +0.83 +0.00  +0.00  +0.00  +0.00  +0.00    -1.20
        +3.84 -34.48  -0.71 +30.60 +0.00  +0.00  +0.00  +0.00  +0.00    -2.07
        +0.00  +0.00  +0.00  +0.00 -0.01  -0.58  -0.73  -0.21  +0.00    +0.00

       "Importance" of each gene:

       +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.000
       +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.000
       +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.000
       +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.000
       +0.000 -0.000 -0.000 -0.000 +0.000 +0.000 +0.000 +0.000 +0.000   -0.000
       +0.000 +0.210 +0.185 +0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.210
       +0.000 +0.193 +0.176 -0.022 +0.000 +0.000 +0.000 +0.000 +0.000   +0.193
       +0.005 -0.000 +0.005 -0.000 +0.000 +0.000 +0.000 +0.000 +0.000   +0.005
       +0.351 +0.829 +1.047 +0.412 +0.860 +0.747 +0.753 +0.821 +0.000   +0.930

       Histogram (counting only x's in the variable matrix):

         *
         *
         *
         *
         *
         *
         *
         *
         *
         *     |     |     |     |     |     |     |     |     |     |     |
         *     |    *|     |     |     |     |     |     |     |     |     |
         *     |    *|     |     |     |     |     |     |     |     |     |
         *     |    *|*    |     |     |     |     |     |     |     |     |
       -**-----+----*+*----+-----+-----+-----+-----+-----+-----+-*---+-----+---
        0.0   0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0   1.1

                                     [F(x+h)-F(x)]/h
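
       A sketch of computing the "importance" table automatically instead of
       by hand (eval_error() and the importance matrix D are placeholder
       names; DI()/DJ() are the dimension macros used elsewhere in the
       program):

         #define H 0.01

         int  i, j;
         REAL f0 = eval_error();            /* unperturbed error F(x) */

         for (i = 0; i < DI(M); i++)
           for (j = 0; j < DJ(M); j++)
           {
             REAL save = M[i][j];
             M[i][j] = save + H;            /* nudge one gene by h */
             D[i][j] = (eval_error() - f0) / H;  /* [F(x+h)-F(x)]/h */
             M[i][j] = save;                /* restore before the next gene */
           }

       (The real version would presumably skip entries that aren't marked x
       in the variable matrix.)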


4/4:   Found the nasty bug!!!  In the matrix multiply code I was saying this...

       |  for (i = 0; i < DI(A); i++)
       |  for (j = 0; j < DI(B); j++)
       |  {                ^
       |    REAL s = 0;
       |    for (k = 0; k < DJ(A); k++)
       |      s += A[i][k] * B[k][j];
       |    C[i][j] = s;
       |  }

       ...instead of this...

       |  for (i = 0; i < DI(A); i++)
       |  for (j = 0; j < DJ(B); j++)
       |  {                ^
       |    REAL s = 0;
       |    for (k = 0; k < DJ(A); k++)
       |      s += A[i][k] * B[k][j];
       |    C[i][j] = s;
       |  }

       ...so that explains why the program was running about 10 times too slow
       in the example I was testing with (FROG441.IN).  It was actually running
       9 times too slow (presumably because B there is a 9x1 column, so j ran
       to 9 instead of 1), and it was also tromping all over memory -- hence
       the hangs.  The program seems stable now, after regression testing with
       16 input files.
       
       It's *TOTALLY* amazing that the program would actually *RUN* given that
       I was tromping all over memory and not doing the matrix operations
       correctly.  That's the second bug now, and in both instances the
       network exhibited tolerance and adapted.  Wow!
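
       For reference, here is a self-contained version of the corrected
       multiply C = A * B.  The Mat struct and the DI/DJ accessors below are
       my stand-ins for the program's real matrix representation:

       |  typedef double REAL;    /* stand-in for the program's REAL */
       |
       |  typedef struct { int di, dj; REAL **e; } Mat;  /* di rows, dj cols */
       |  #define DI(m) ((m).di)
       |  #define DJ(m) ((m).dj)
       |
       |  /* C must be DI(A) x DJ(B), with DJ(A) == DI(B).  Looping j up to
       |   * DI(B) instead of DJ(B) was the bug: on a 9x1 B it did 9x the
       |   * work and stored past the end of C -- hence the hangs. */
       |  void MatMul(Mat C, Mat A, Mat B)
       |  {
       |    int i, j, k;
       |
       |    for (i = 0; i < DI(A); i++)
       |    for (j = 0; j < DJ(B); j++)
       |    {
       |      REAL s = 0;
       |      for (k = 0; k < DJ(A); k++)
       |        s += A.e[i][k] * B.e[k][j];
       |      C.e[i][j] = s;
       |    }
       |  }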