Usage

To use Genotick for trading predictions you have three options:
1) Use our dedicated cloud service. It's free and very simple.
2) Download a zip file here. Then unzip and run with:
java -jar Genotick-version.jar
You need Java 8 installed.
3) Install from sources:
git clone https://github.com/alphatica/genotick
cd genotick
mvn package
cp target/Genotick* /your/path/to/folder

Installation

Genotick doesn't require any installation. Simply unzip the file. A directory 'Genotick' will be created with some example data.

Displaying version info

This one is simple:
java -jar Genotick-version.jar showVersion

Yahoo data

Genotick can fix Yahoo data. Download data to any folder, for example 'yahoo_data' then run:
java -jar Genotick-version.jar fixYahoo=yahoo_data

Genotick will convert the Yahoo data format into one it can read. WARNING! This overwrites the original files.

Preparing data

Genotick accepts data in csv format, one market in one file. There are some basic requirements:
The first column must be a time point parsable as an integer (Java's type Long). Values must be in ascending order with no duplicates. It can be a date written as YYYYMMDD, a time written as YYYYMMDDhhmmss, and so on. It has to be a single number with no colons or similar separators. Dashes are allowed. If you have historical quotes like this (1st Jan 2010, time 3.45 pm, open/high/low/close):

20100101,1545,100,102,99,102
It needs to be changed to
201001011545,100,102,99,102
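
As a rough illustration of the conversion above, the following Python sketch joins a separate date and time field into one time point and checks that time points are strictly ascending. The function names are hypothetical and are not part of Genotick. The sketch assumes the time is written as hhmm and that the values contain no dashes.

```python
def merge_date_time(line):
    """Join the first two CSV fields (date, time) into a single time point."""
    date, time, *rest = line.strip().split(",")
    # Zero-pad the time so that "905" (9:05 am) becomes "0905".
    time_point = date + time.zfill(4)
    int(time_point)  # raises ValueError if the result is not a plain integer
    return ",".join([time_point, *rest])


def is_strictly_ascending(lines):
    """Return True if the first column rises on every row (no duplicates)."""
    points = [int(line.split(",")[0]) for line in lines]
    return all(a < b for a, b in zip(points, points[1:]))


print(merge_date_time("20100101,1545,100,102,99,102"))
# prints 201001011545,100,102,99,102
```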

The second and all remaining columns must be numbers parsable as real numbers (Java's type Double). The second column is the one used to gauge predictions. For financial markets, make sure it is something you can trade on, like an open price. Genotick checks a prediction as if a transaction was opened on the next row and closed on the row after that. For example, with data:

20090102,100,101,100,101
20090105,101,103,100,102
20090106,102,104,102,103

Let's say the prediction on 2nd Jan was UP. Genotick assumes the transaction was opened on 5th Jan at 101 and closed on 6th Jan at 102. Genotick neither cares about nor understands the data it gets. The first column must be time, the second the price used for trading, and the rest can be anything. You can put Bangladesh butter production if you think it's useful. A minimum of two columns is required: time and trade price.
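
The rule above can be sketched in Python as follows. This is only an illustration of the "open on the next row, close on the row after" idea. The function name is hypothetical, and scoring the outcome as a plain price difference is an assumption, since the text does not say how Genotick measures the result.

```python
def prediction_outcome(rows, index, prediction):
    """Price change credited to a prediction made on rows[index].

    rows: tuples of (time_point, trade_price, ...) in ascending time order.
    prediction: +1 for UP, -1 for DOWN.
    """
    entry_price = rows[index + 1][1]  # transaction opened on the next row
    exit_price = rows[index + 2][1]   # and closed on the row after that
    # Assumption: the outcome is the signed price difference.
    return prediction * (exit_price - entry_price)


rows = [
    (20090102, 100, 101, 100, 101),
    (20090105, 101, 103, 100, 102),
    (20090106, 102, 104, 102, 103),
]
print(prediction_outcome(rows, 0, +1))  # UP on 2nd Jan: 102 - 101, prints 1
```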

Reversing data

Genotick uses a trick to remove useless systems: a prediction must be mirrored on mirrored data. So, for example, a prediction on GbpUsd must be the opposite of the one on UsdGbp (the implementation is simpler than that, but this explanation is good enough for a high-level overview). This also helps to remove robots (systems) that are always long or always short. The problem with reversing data is that doing it by hand (or even in Excel/LibreOffice) is painful. You can use Genotick to reverse data for you. Let's say that directory 'mydata' contains files 'spx.csv' and 'gold.csv'. Command:
java -jar Genotick-version.jar reverse=mydata
will produce files 'reverse_spx.csv' and 'reverse_gold.csv' with open/high/low/close mirrored and other columns unchanged. These files are ready to be used with Genotick. Then, while training, you can request 'Require symmetrical robots' to remove robots (systems) that don't yield mirrored predictions.

Settings

Quick reference for settings:

StartTimePoint – Time at which Genotick should start its simulation / training

EndTimePoint – End time of simulation / training.

DataDirectory – this is a directory (relative to your current directory) where data files are stored.

PopulationDAO – this tells Genotick where robots' files should be. It must be a directory, must exist and has to be readable. If commented out then training will be done in RAM, so obviously no prediction is possible in this mode (you need a population to make a prediction). Using path to a directory while training will make Genotick a lot slower but will allow for much larger population.

PerformTraining – true. If true Genotick will do full training. If false, only prediction is given (so previous setting 'PopulationDAO' must point to existing population) and robots aren't updated.

PopulationDesiredSize – Desired size for the population. Should be in thousands at least to get satisfactory results. The more the merrier.

ProcessorInstructionLimit 256 – This setting prevents robots (systems) from running forever. The given number is used to calculate the maximum number of instructions that can be executed for a robot on each TimePoint for each data file. Currently the algorithm is simply: processorInstructionLimit * robotLength

The higher the number is, the more time it takes to execute a robot but also enables longer running robots before throwing in the towel.

MaximumDeathByAge 0.8 – This setting is used to calculate how many robots are considered for killing based on their age.

MaximumDeathByWeight 0.8 – This setting is used to calculate how many robots are considered for killing based on their weight.

ProbabilityOfDeathByAge 0.05 – Probability of killing a robot because it's too old. So let's go through this from the top:

Total population is let's say 5000. MaximumDeathByAge is 0.8. That means that oldest 80% of 5000 will be considered to be killed (4000). Probability is 0.05, so random 5% of those 4000 robots will be killed.
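
The worked example above can be sketched in Python like this. The function name is hypothetical, and treating the probability as an independent draw per candidate robot is an assumption. The text only says that a random 5% of the candidates are killed.

```python
import random


def select_deaths_by_age(ages, maximum_death_by_age, probability, rng=random):
    """Return the indices of robots chosen to be killed because of their age."""
    candidate_count = round(len(ages) * maximum_death_by_age)
    # Sort robot indices so that the oldest robots come first.
    oldest_first = sorted(range(len(ages)), key=lambda i: ages[i], reverse=True)
    candidates = oldest_first[:candidate_count]
    # Assumption: each candidate is killed independently with this probability.
    return [i for i in candidates if rng.random() < probability]


ages = list(range(5000))  # hypothetical ages: robot i is i time points old
killed = select_deaths_by_age(ages, 0.8, 0.05)
# 4000 robots are candidates, so roughly 200 of them are killed on average.
```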

ProbabilityOfDeathByWeight 0.5 – Probability of a robot getting killed because its weight is too close to 0. Currently, killing by weight happens only if there is no more space to breed in the population.

InheritedChildWeight 0.0 – When a child is born its weight is zero because it has made no predictions. This setting allows the child's weight to be set as a fraction (in range 0..1) of its parents' average weight. This is done to protect young robots from getting killed when they have made only a few predictions.

DataMaximumOffset 64 – This is how far into the past a robot can read data. Set it to something reasonable, depending on your time frame.

ProtectRobotsUntilOutcomes 50 – This is how long a robot is protected (i.e. cannot be killed). The number of outcomes increments by one for every data file at every time point. So if you have 5 markets with 5 extra reversed files (that's 10 files), robots will be protected for only 5 days (5 markets * 2 * 5 days = 50 outcomes).
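
The arithmetic above can be checked with a small Python sketch. The function name is hypothetical. The sketch only restates the rule that one outcome is counted per data file per time point.

```python
def protected_time_points(protect_until_outcomes, data_files):
    """Number of time points needed to collect the protected number of outcomes."""
    # One outcome is counted for every data file at every time point.
    return protect_until_outcomes / data_files


markets = 5
data_files = markets * 2  # each market plus its reversed copy
print(protected_time_points(50, data_files))  # prints 5.0
```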

NewInstructionProbability 0.2 – Probability of new instruction when making a child.

InstructionMutationProbability 0.6 – Probability of mutating existing instruction when making a child.

SkipInstructionProbability 0.2 – Probability of skipping an instruction when making a child. It is best set very close to NewInstructionProbability, otherwise robots will either shrink (i.e. become useless) or grow uncontrollably (and execute forever).
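
To see why the two probabilities should be close, here is a rough Python sketch of how the expected robot length could drift over generations. The model is an assumption made for illustration: each parent instruction is dropped with the skip probability and gains one extra instruction with the new-instruction probability. Genotick's actual breeding code may work differently, and the function name is hypothetical.

```python
def expected_length(parent_length, new_prob, skip_prob, generations=1):
    """Expected robot length after some generations under the assumed model."""
    # Each instruction survives with (1 - skip_prob) and adds new_prob extra.
    growth_per_generation = 1.0 - skip_prob + new_prob
    return parent_length * growth_per_generation ** generations


balanced = expected_length(100, new_prob=0.2, skip_prob=0.2, generations=50)
growing = expected_length(100, new_prob=0.3, skip_prob=0.2, generations=50)
shrinking = expected_length(100, new_prob=0.1, skip_prob=0.2, generations=50)
```

Under this model the balanced setting keeps the expected length at about 100, while a difference of only 0.1 in either direction makes robots grow or shrink by orders of magnitude within 50 generations.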

MinimumOutcomesToAllowBreeding 25 – This setting is used to decide whether a robot is old enough to have a child.

MinimumOutcomesBetweenBreeding 25 – This setting is used to decide how soon a robot can have a child again.

KillNonPredictingRobots true – If set to true, robots that make no prediction will be removed immediately, even if they are protected by ProtectRobotsUntilOutcomes.

RandomRobotsAtEachUpdate 0.01 – Number of totally new and random robots to be added at each time point (as a fraction of PopulationDesiredSize). Even if population is full.

ProtectBestRobots 0.01 – Elitism. Number of best robots to protect (as a fraction of PopulationDesiredSize). Even if they are old and smell funny, as long as they are useful.

RequireSymmetricalRobots true – Should be used only if every data file has its reversed equivalent. If set to true, a robot whose numbers of UP and DOWN predictions are not the same is removed immediately, even if it's protected.

ResultThreshold 1 – This setting allows Genotick to sit on the sideline if the decision from robots is too close to a tie. Weights for Long and Short are added up separately. The higher of these two numbers is divided by ResultThreshold. If it's still higher than the other, Genotick gives the prediction. If not, Genotick votes to be out of the market. Setting it below 1 makes no sense.
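
The threshold rule above might be sketched in Python as follows. The function name and the "UP", "DOWN" and "OUT" labels are illustrative, and the sketch assumes both weight sums are non-negative.

```python
def cumulative_prediction(weight_long, weight_short, result_threshold=1.0):
    """Combine the summed Long and Short weights into one decision."""
    higher = max(weight_long, weight_short)
    lower = min(weight_long, weight_short)
    # The stronger side must still win after being divided by the threshold.
    if higher / result_threshold > lower:
        return "UP" if weight_long > weight_short else "DOWN"
    return "OUT"  # too close to a tie, stay out of the market


print(cumulative_prediction(10.0, 6.0, result_threshold=1.5))  # prints UP
print(cumulative_prediction(10.0, 6.0, result_threshold=2.0))  # prints OUT
```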

IgnoreColumns 0 – This tells Genotick to ignore the first N columns for learning. It still uses the first column (second if you count the TimePoint column) for trading. Example column count (date, open, high, low, close, volume):
20160222  101  99  103  102  42
0         1    2   3    4    5
If 'ignoreColumns' is 3 then only the columns numbered 4 and 5 (close and volume) will be used for learning.
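
The column selection described above can be sketched in Python. The function names are hypothetical, and the row holds column names instead of real values so that the result is easy to read.

```python
def learning_columns(row, ignore_columns):
    """Return the columns a robot may read when IgnoreColumns is set."""
    # row[0] is the time point, so data columns start at index 1.
    return row[1 + ignore_columns:]


def trade_price(row):
    """The first data column is always used for trading, ignored or not."""
    return row[1]


row = ["date", "open", "high", "low", "close", "volume"]
print(learning_columns(row, 3))  # prints ['close', 'volume']
print(trade_price(row))          # prints open
```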

Running

Java 7 or 8 is required. Genotick can run in two modes:
- training (setting PerformTraining = true)
- prediction (setting PerformTraining = false)
To make a prediction you need a trained population of robots. Let's start with training first:
Run Genotick with command:
java -jar Genotick-version.jar
Genotick will ask a series of questions. Press ENTER if you accept default value or enter a new value. Pay attention to 'Start time point' and 'End time point'. By default Genotick goes into training mode (setting 'performTraining' is true). After the last question Genotick starts training. This will take a long time (on default settings and reasonable computer it should be less than an hour). Output will go to genotick_current_date_time.txt file.
At the end of training population will be copied to something like 'savedPopulation_2015_10_15_17_10' dir. Rename it to 'my_population_dir' to use trained system for actual predictions going forward.

Checking predictions

Let's say your last date in your data is 2015/10/27. Run genotick with:
java -jar Genotick-version.jar
Enter your last date as 'start TimePoint' without any spaces. In our example (date 2015 October 27) it should be '20151027'. For 'Population storage' enter 'my_population_dir'. For 'PerformTraining' enter 'false'. Genotick will read systems from the 'my_population_dir' directory, check the systems' weights and yield a cumulative prediction. There will be no output to the console! Everything will go to the 'genotick_current_date_time.txt' file.

Training with random settings

You may be interested to see how Genotick behaves with different settings. Run it with command:
java -jar Genotick-version.jar input=random
to perform training with random settings.

Inputs from a file

Genotick can read its settings from a file. Run with command:
java -jar Genotick-version.jar input=file:path\to\file
An example config file is included in zip.

Output to a file

Genotick can write equity to a file. Command:
java -jar Genotick-version.jar output=csv
This makes Genotick write equity (in CSV format) to a file named something like profit_21245.csv. The number in the file name should be a PID but it's not guaranteed.

Showing systems' info

To see some basic info about systems in a population use command:
java -jar Genotick-version.jar showPopulation=directory_with_population
This will print systems' info in CSV format. To see the algorithm behind each system use:
java -jar Genotick-version.jar showRobot=directory_with_population\system name.prg

You can look at individual systems but you're not supposed to. Genotick uses wisdom-of-the-crowd to yield a cumulative prediction. It does not look at individual systems. You shouldn't either.

Contact

You can contact the author at lukasz.wojtow@gmail.com. Any questions, suggestions and feature requests are welcome.