Selector Introduction

 

Introduction

The Evolutionary Sequencing Machine (ESM) evolves a population of well-formed formulas (WFFs) in a problem-oriented grammar known as Selector. Each WFF is known as a selector and is trained to operate on the data from a single time step. After training, the selector is given the data for the testing time step and is expected to select the "best" elements from that time step.

The Selector grammar is defined, within the ESM Lambda, in esm:selector:%DECLARATION, which is a feature-based grammar specification understood by parseLib. The esm.selector child Lambda is a "Selector" parser, generated from esm:selector:%DECLARATION by parseLib. The Selector parser translates ASCII text strings into Selector WFFs, which are annotated s-expressions. Each WFF s-expression in the population may be annotated with grammar notes taken from the rules defined in esm:selector:%DECLARATION.

The Selector language is a dialect of JavaScript, chosen so that the current generation of programmers can learn and read it easily. Each Selector WFF must have the following enclosing syntax to remain grammatically correct.

         
             function (XT) {                        // Select the best Vectors from the set, XT, of Vectors.


             }
             

A complete description of Selector is out of place here (the full Selector language is documented later in this reference material). In brief, the Selector language allows the full range of logical and arithmetic operators as well as sorting, filtering, and regression operators against the input data. The input data is always for a single time step; only the fitness function is allowed to compare each selector's performance across all training time periods.

The argument XT is a Lambda containing the set of Number Vectors for a single time step. It also offers a number of operators to aid in selecting the best vectors. The expression XT.allRows represents all of the Number Vectors in the current time step. The expression XT.selectedRows represents the Number Vectors in the current time step that are currently selected. Each of the Number Vectors in XT.allRows contains M+1 elements (known to Selector Lambdas as xtime and x1 through xM).

The argument XT, when first passed to a Selector Lambda, is always in a state where XT.allRows == XT.selectedRows. The task of the Selector Lambda is to perform selection operations on XT such that, when completed, XT.selectedRows contains the "best" vectors found in XT.allRows. Some examples of simple Selector Lambdas are as follows:

             regress (x10 / sin(x12));          // Train a linear regression for the model "(x10 / sin(x12))" and score based upon the trained model.
             
             svmregress (x10,cos(x12)/log(x3)); // Train a support vector regression for the pseudo variables x10 and cos(x12)/log(x3), and score based upon the trained model.
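
The contract described above (XT starts with every row selected, and the selector narrows the selection) can be sketched in Python. This is only an illustration under stated assumptions: the class name SelectionContext, the method keep_top, the scoring expression, and the sample rows are all hypothetical and are not part of the ESM or Selector API.

```python
import math


class SelectionContext:
    """Hypothetical stand-in for the XT argument (not the ESM API)."""

    def __init__(self, rows):
        # Each row is [xtime, x1, ..., xM], that is, M+1 numbers.
        self.all_rows = list(rows)
        # On entry every row is selected, so selected_rows equals all_rows.
        self.selected_rows = list(rows)

    def keep_top(self, score, count):
        # Narrow the current selection to the `count` highest-scoring rows.
        ranked = sorted(self.selected_rows, key=score, reverse=True)
        self.selected_rows = ranked[:count]


def selector(xt):
    # Rank rows by x1 / (1 + |sin(x2)|) and keep the best two.
    xt.keep_top(lambda row: row[1] / (1.0 + abs(math.sin(row[2]))), 2)
    return xt


rows = [
    [0.0, 4.0, 0.0],  # score 4.0, because sin(0) == 0
    [0.0, 9.0, 0.0],  # score 9.0
    [0.0, 1.0, 0.0],  # score 1.0
]
xt = selector(SelectionContext(rows))
```

After the call, xt.all_rows still holds all three rows, while xt.selected_rows holds only the two highest-scoring ones.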
             

Lambda Objects

AIS "Lambdas" are a unique type of object designed to act as the building blocks for intelligent, adaptive systems. Lambdas contain more than just binary machine code (Analytic Information Server supports many built-in functions, which are primarily binary machine code, but these are not Lambdas). Lambdas are something more than just functions: they are building-block objects that contain the structure necessary to provide some rudimentary autonomy. Lambdas can contain other child Lambdas and can give birth to other child Lambdas. Lambdas can publish their preferred style of interface. Lambdas have an abstract threshold (like a cell membrane), which makes the Lambda aware of any mutative or referential access attempt from the outside. Lambdas may run on native machine code, or they may be emulated by a virtual machine, and there may be a different virtual machine for each Lambda. Lambdas contain their persistent and temporary knowledge variables. Lambdas contain the original source code used to compile them. Lambdas can be generated from multiple languages. The Evolutionary Sequencing Machine comes with a built-in Selector compiler, which produces AIS Lambdas.

Lambda Properties

The Lambda object stores Lambda behavior and knowledge in a standard building block format (regardless of the original source language). The Analytic Information Server Lambda object contains the following nine properties:

Av:

The arguments Structure object containing the Lambda's arguments.

In:

The faces: Structure object containing the Lambda's published interface styles.

Pc:

The Pcode Vector object containing the Lambda's virtual machine codes.

Pv:

The pvars: Structure object containing the Lambda's persistent variables.

Cv:

The cvars: Structure object containing the Lambda's persistent class variables.

Nc:

The Native Code Vector object containing the Lambda's native machine code.

Sc:

The Source Code Vector containing the original language source for debugger display.

Tv:

The vars: Structure object containing the Lambda's temporary frame variables.

Vm:

The Virtual Machine emulator function (each Lambda may run on a separate virtual machine).

 

A Lambda is a First Class Object. A First Class Object in Analytic Information Server is any object that is fully exposed, i.e., all of its Structures are visible and modifiable by the programmer. All Lambdas have the following data structures: source code tokens (Sc), pcode tokens (Pc), argument variables (Av), persistent variables (Pv), persistent class variables (Cv), temporary variables (Tv), interfaces (In), native code (Nc), and the virtual machine emulator (Vm). All Lambda structures can be viewed and modified by the programmer.
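
To make the property list concrete, here is a minimal Python sketch that models a Lambda as a record with one field per property. The class name LambdaSketch, the Python field types, and the sample values are assumptions made for illustration; the actual AIS object layout is not described here.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable, Optional


@dataclass
class LambdaSketch:
    """Hypothetical Python model of the Lambda properties (not AIS code)."""

    Av: dict = field(default_factory=dict)   # argument variables
    In: dict = field(default_factory=dict)   # published interfaces
    Pc: list = field(default_factory=list)   # pcode (virtual machine codes)
    Pv: dict = field(default_factory=dict)   # persistent variables
    Cv: dict = field(default_factory=dict)   # persistent class variables
    Nc: Optional[bytes] = None               # native machine code, if any
    Sc: str = ""                             # original source text
    Tv: dict = field(default_factory=dict)   # temporary frame variables
    Vm: Optional[Callable[..., Any]] = None  # virtual machine emulator


foo = LambdaSketch(Sc="/* source text */", Av={"i": 0})
foo.Pv["callCount"] = 1  # every structure is visible and modifiable
property_names = [f.name for f in fields(foo)]
```

Because every field is an ordinary public attribute, the sketch mirrors the "fully exposed" idea: nothing prevents a program from reading or rewriting any of the structures.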

Selector Lambdas

The principal activity of the Selector Lambda is to reduce sets to smaller sets which have a higher score than the original set. This process is called selection. For instance, in the stock market, we start with a set of all possible stocks and we wish to select a few stocks to purchase. If the few stocks we purchase have a higher score (percent profit) than the average of all stocks, then we are happy. The act of selecting a few stocks to purchase reduces the original set of all stocks down to the set of those we wish to purchase.

Other examples of selection include reviewing a set of all United States households to select only those households which are to receive this month's promotional mailing, and reviewing a set of possible oil deposit sites to select only those sites where we wish to drill. There are many other examples of selection.
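
The stock example can be worked through with a few lines of Python. The ticker names, feature values, scores, and the 0.7 threshold below are all invented for illustration. The point is only the comparison between the average score of the selected subset and the average score of the whole set.

```python
from statistics import mean

# Hypothetical data: (feature known at selection time, score realized later).
stocks = {
    "AAA": (0.9, 4.0),
    "BBB": (0.1, -2.0),
    "CCC": (0.8, 7.5),
    "DDD": (0.4, 0.5),
    "EEE": (0.7, 3.0),
    "FFF": (0.2, -1.0),
}


def select(universe, keep):
    """Reduce a set to the members for which keep(feature) is true."""
    return {name: pair for name, pair in universe.items() if keep(pair[0])}


def average_score(universe):
    """Average the realized scores of every member of the set."""
    return mean(score for _, score in universe.values())


chosen = select(stocks, lambda feature: feature >= 0.7)
baseline = average_score(stocks)   # average over all stocks
selected = average_score(chosen)   # average over the selected subset
```

Here the selection keeps three of the six stocks, and the selection is successful in the sense used above because selected is greater than baseline.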

Selector Data Types

The Selector programmer has access to all of the types in the Analytic Information Server environment. These are the same as the Lisp data types, which are divided into three categories: Native Data Types (also known as Immediate types), Objects (heap objects), and Repositories. The Native (immediate) types can be entirely contained within the immediate data of a single Virtual Machine Container. The Object (heap object) types are too large to be contained within a single Virtual Machine Container and require extra memory, which must be managed by the heap manager. Without exception, all of the Object types are identified by an object id. The object id identifies a block of memory, managed by the Analytic Information Server memory manager, in which the Object's data is stored (see Object Identifier Notation).

Virtual Machine Containers are of fixed length and come in different sizes. Small data items are stored in immediate mode, and may be moved to the heap if the data becomes too large to store immediately.

The Heap contains memory resident data, which is of variable length or is too large to fit in small fixed containers. The Analytic Information Server object Heap manager supports automated object resizing, garbage collection, and anti-fragmentation algorithms so that the user may concentrate on the analysis and modeling of data rather than on memory management.

Repositories (databases) contain persistent data of all sorts. Analytic Information Server supports repositories with multiple database volumes and multiple database schemas, including General Object Repositories, Text Repositories, and Lambda Repositories.

The generic Analytic Information Server data type is known to Selector as obj. Omitting the type identification, as in var n;, causes Selector to treat the variable n as being of type obj, that is to say, any possible Analytic Information Server data type.

The Selector compiler also supports strong typing of declared variables. Providing a type identification, as in var int n;, causes Selector to treat the variable n as being of type int, that is to say, it will be managed as an Analytic Information Server type Integer.

The following is a list of Selector strong data types together with the Analytic Information Server types which they represent.

Selector type   AIS type
obj             Object
bool            Boolean
char            Character
int             Integer
float           Number
text            Text
string          String
symbol          Symbol
bytvec          ByteVector
fltvec          FloatVector
stc             Structure
dir             Directory
dic             Dictionary
matrix          Matrix
nummat          NumMatrix
vec             Vector
bitvec          BitVector
numvec          NumVector
intvec          IntVector
objvec          ObjVector
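
The defaulting rule (no type identification means obj) can be sketched in Python as a lookup over the type names listed above. The regular expression and the function declared_type are hypothetical and handle only a single, uninitialized declaration such as var int n;. They are not the Selector compiler's actual parsing rules.

```python
import re

# Selector strong type names and the AIS types they represent.
SELECTOR_TYPES = {
    "obj": "Object", "bool": "Boolean", "char": "Character",
    "int": "Integer", "float": "Number", "text": "Text",
    "string": "String", "symbol": "Symbol", "bytvec": "ByteVector",
    "fltvec": "FloatVector", "stc": "Structure", "dir": "Directory",
    "dic": "Dictionary", "matrix": "Matrix", "nummat": "NumMatrix",
    "vec": "Vector", "bitvec": "BitVector", "numvec": "NumVector",
    "intvec": "IntVector", "objvec": "ObjVector",
}

# Hypothetical, simplified pattern: "var", an optional type name, a variable.
_DECLARATION = re.compile(r"^\s*var\s+(?:(\w+)\s+)?(\w+)\s*;\s*$")


def declared_type(statement):
    """Return (variable name, AIS type) for a one-variable declaration."""
    match = _DECLARATION.match(statement)
    if match is None:
        raise ValueError(f"not a simple declaration: {statement!r}")
    type_name, variable = match.groups()
    if type_name is None:
        type_name = "obj"  # no type identification means the generic type
    return variable, SELECTOR_TYPES[type_name]


untyped = declared_type("var n;")
typed = declared_type("var int n;")
```

In this sketch untyped maps n to Object and typed maps n to Integer, which matches the two cases described in the text.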

Strong Typing

The Selector programmer has access to compile time, strongly typed variable declarations. Strongly typed variables are compiled with Analytic Information Server's strong typed virtual machine instructions. Strongly typed variables operate faster at run time, but they are more prone to programmer error because little or no run time type checking is performed.

The programmer can even cast an arbitrary Selector expression to a valid type. The casting will alert the Selector compiler to treat the result of the cast expression as specified. This will direct the Selector compiler to use Analytic Information Server's strong typed virtual machine instructions with the cast expression. Warning: casting does not introduce any run time type checking.

The following Selector code sample illustrates the actions of the Selector compiler when strongly typed variable declarations and type casts are encountered.

The Selector source code for foo

             // A test of strong typing, including expression type casting, in Selector.
             function foo(int i) {
                 var char c1, string name=new('String',"Hello There");
                 c1 +=name[((int)length(name))-i];
             }

The compiled code for foo

Virtual Machine Instructions for: foo
0000: push "String,"Hello There"
0007: call 2,new,vars:(name)
0011: push vars:(name)
0013: call 1,length,vars:(__T4)
0017: isub args:(i),vars:(__T4),vars:(__T3)
0021: refstring vars:(__T3),vars:(name),vars:(__T2)
0025: cadd vars:(__T2),vars:(c1),vars:(c1)
0029: return vars:(c1)
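
The warning that a cast adds no run time type checking has a loose analogue in Python, where typing.cast only informs a static checker and returns its argument unchanged. The sketch below is an analogy, not a model of the AIS virtual machine. The function last_code_point is hypothetical, and the way a bad cast eventually fails in Python (a TypeError raised later) is not a claim about how Selector behaves.

```python
from typing import cast


def last_code_point(name: str, offset: object) -> int:
    # cast() performs no conversion or validation at run time;
    # it simply returns `offset` unchanged.
    index = cast(int, offset)
    return ord(name[len(name) - index])


ok = last_code_point("Hello There", 1)  # offset really is an int

try:
    last_code_point("Hello There", "1")  # the cast hides the mistake...
    failed = False
except TypeError:
    failed = True                        # ...until the subtraction fails
```

The cast itself never complains. The error shows up only when the mistyped value is actually used, which is why casts deserve the same care as strongly typed declarations.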

White Space

The Selector compiler uses white space to separate each of its symbols and operators. The Selector white space characters include all the standard 8-bit ASCII control characters (less than 32 decimal), and the blank character (32 decimal).

LF, CR, TAB ..control chars..

space

Apart from its role as a separator, white space is ignored by the Selector compiler, so both of the following statements are valid:

             a = 1 + 2;  // This is a valid statement
             b=1+2;      // This is also a valid statement
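
A small Python tokenizer shows why the amount of white space does not matter once the text has been split into tokens. The token pattern below is a hypothetical and much simplified one (names, integers, and a few single-character operators). The real Selector lexer is generated by parseLib and is not reproduced here.

```python
import re

# Hypothetical token pattern. findall() skips any character that matches
# no alternative, which includes blanks, tabs, and line breaks.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[=+\-*/;()]")


def tokenize(source):
    """Split source text into tokens, discarding all white space."""
    return TOKEN.findall(source)


spaced = tokenize("a = 1 + 2;")
packed = tokenize("a=1+2;")
scattered = tokenize("\t\t   a\n=\r\n1 +\t2 ;")
```

All three calls produce the same token list, so a parser working on tokens cannot tell the three spellings apart.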

Special Characters

Selector uses the standard 8-bit ASCII character set. Some of the Selector special characters serve to group a set of characters as a single unit (e.g. double quotes group characters to form a string constant). The remainder of the special characters serve to separate tokens (e.g. comma or blank) or prefix a constant (e.g. $ # ).

The following are the Selector special characters.

\ | ( ) [ ] { } # @
' ' , " : ; $ % .

Naming Conventions for Variables

Selector variable names are composed of case-sensitive alphanumeric characters. No spaces are allowed in a variable name, but the underscore (_) character may be embedded to separate the words of a multi-word name. Another convention for making multi-word names more readable is to start the first word with a lowercase letter and begin each succeeding word with an uppercase letter.

For example

myVariable
sum
namesOfStudents

Constants

Selector is a dynamically typed language. The type of a variable is unknown until run time, when data is stored into it. The following table contains the constant forms recognized by the Selector compiler. For more detail on the data types listed below, see the Analytic Information Server Programmer's Guide.

Type       Constant Form
Void       void or nil
Boolean    true or false
Date       #Mar,2,1987 or #Jun,1,200BC
Integer    12 or -2345
Number     12.9 or 0.123456
Object     #<Vector 1273>
String     "Hello World"
Symbol     'Hello'
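
As a rough illustration of how constant forms can be told apart by shape alone, the following Python sketch classifies a few of the forms in the table above. The patterns are deliberately simplified assumptions (Date and Object forms are left out), and they are not the rules used by the Selector compiler.

```python
import re

# Hypothetical patterns for some of the constant forms listed above.
CONSTANT_FORMS = [
    ("Void", re.compile(r"void|nil")),
    ("Boolean", re.compile(r"true|false")),
    ("Number", re.compile(r"-?\d+\.\d+")),
    ("Integer", re.compile(r"-?\d+")),
    ("String", re.compile(r'"[^"]*"')),
    ("Symbol", re.compile(r"'[^']*'")),
]


def constant_type(text):
    """Return the type name of a constant, or None if no pattern matches."""
    for type_name, pattern in CONSTANT_FORMS:
        if pattern.fullmatch(text):
            return type_name
    return None


examples = ["void", "true", "12", "-2345", "12.9", '"Hello World"', "'Hello'"]
classified = [constant_type(example) for example in examples]
```

Forms the sketch does not cover, such as #Mar,2,1987, simply return None rather than being guessed at.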

Comments

Because the Selector compiler tries to evaluate all of the words in a script, it is useful to have text that is ignored by the compiler. This ignored text, called a comment, allows you to include information that may be useful in understanding the Selector statements. There are two types of comments: single line and multi-line.

A single line comment tells the compiler to ignore all the characters up to the end-of-line (eol). A single line comment must begin with the characters: //

For Example:

// This is a comment

A multi-line comment tells the compiler to ignore all the characters embedded in between the delimiters: /* and */

For Example:

/*  Humpty Dumpty

    sat on a wall  */
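
The two comment forms can be illustrated with a naive Python comment stripper. This is a sketch under a stated simplification: it does not protect comment delimiters that appear inside string constants or inside the other kind of comment, which a real lexer would have to handle. The function strip_comments is hypothetical.

```python
import re

# Naive patterns: a block comment runs from /* to the nearest */, and a
# line comment runs from // to the end of the line.
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
LINE_COMMENT = re.compile(r"//[^\n]*")


def strip_comments(source):
    """Remove /* ... */ and // comments from source text."""
    without_blocks = BLOCK_COMMENT.sub("", source)
    return LINE_COMMENT.sub("", without_blocks)


source = "a = 1; // set a\n/* Humpty Dumpty\n   sat on a wall */\nb = 2;"
cleaned = strip_comments(source)
```

Only the two statements survive in cleaned. The line breaks that surrounded the comments are left in place.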

Global Variables

Selector variables have automatic global declaration. Referencing a symbol that has not already been declared automatically causes it to be declared as a global variable. This feature has been added to make Selector more user-friendly and to make Selector consistent with other Analytic Information Server languages.

The following Selector expressions are equivalent (assuming that X has not already been referenced).

X = 23

is equivalent to:

var X = 23

Selector global variables are valid during the whole life of the current workspace (see the _globals global symbol table variable). A Selector global variable is referenced by specifying its symbol. In addition to user-defined globals, Selector global variables include all of the built-in functions, such as +, -, *, upperCase, sin, cos, and date.

The Analytic Information Server Selector dialect is specified as case-sensitive (most dialects of Selector are case-insensitive). Therefore:

Var

is NOT equivalent to:

var
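
Automatic global declaration and case sensitivity can both be modeled with a plain dictionary. In the Python sketch below, the class Workspace and its methods are hypothetical, and using None as the value of a freshly declared symbol is an assumption standing in for void. The dictionary plays the role of the _globals symbol table.

```python
class Workspace:
    """Hypothetical model of automatic global declaration (not AIS code)."""

    def __init__(self):
        self.globals = {}  # stands in for the _globals symbol table

    def assign(self, symbol, value):
        # Assigning to an undeclared symbol declares it as a global.
        self.globals[symbol] = value

    def reference(self, symbol):
        # Referencing an undeclared symbol also declares it as a global.
        # None stands in for void here (an assumption of this sketch).
        return self.globals.setdefault(symbol, None)


ws = Workspace()
ws.assign("X", 23)          # like:  X = 23
upper = ws.reference("X")   # finds the global declared above
lower = ws.reference("x")   # a different symbol, declared on first use
```

Because dictionary keys are case-sensitive, X and x end up as two separate globals, just as Var and var are two separate symbols.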

Function Calls

Any user-defined Selector Lambda, any Lisp Lambda, and any Analytic Information Server function may be called from Selector. The syntax is simply the function name followed by parentheses, (). If the function requires arguments, they must be supplied between the parentheses, and multiple arguments must be separated by commas. The parentheses are mandatory even if no arguments are supplied. All Selector functions receive arguments by value. After function invocation, one and only one result value is returned.

For example

                   mod(10, 2);     //Returns 0

                   today();        //Returns 729855
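
The "arguments by value, exactly one result" rule can be imitated in Python, which normally passes object references. The sketch below makes a strong assumption: it treats by value as a full copy of each argument. Whether AIS copies heap objects or passes their object ids by value is not stated here. The helper call_by_value and the function clear_first are hypothetical.

```python
import copy


def call_by_value(function, *arguments):
    """Call `function` with private copies of its arguments."""
    return function(*(copy.deepcopy(argument) for argument in arguments))


def mod(dividend, divisor):
    """Return the remainder of dividend divided by divisor."""
    return dividend % divisor


def clear_first(values):
    values[0] = 0  # mutates only the callee's private copy
    return values


original = [7, 8, 9]
result = call_by_value(clear_first, original)
remainder = call_by_value(mod, 10, 2)
```

The caller's list is untouched after the call, each call yields a single result, and remainder matches the mod(10, 2) example above.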