Developing a mobile application predicting stock prices: implementation
The first thought that came to my mind after deciding to dedicate my work to writing such an application was that it is too hard. And it wasn’t about the difficulty of writing AI/ML models that predict stock prices; it was about integrating everything together so that the user gets an application that is genuinely useful, not just a concept. Here I’m going to show you how the cloud helped me achieve that goal and how I combined the AI models, the cloud infrastructure, and the mobile application.
In this article, I describe the implementation of every part of the application. Feel free to skip the parts you are not interested in.
So let’s redefine the project scope one more time.
The main goal of the application was to deliver a user-friendly interface to the results of AI models forecasting cryptocurrency prices, which could help traders on the exchange. A mobile application was chosen, as it is the easiest way of accessing the information on the go, and Android was chosen as the target OS.
All the computations couldn’t be performed on the client side, because that is not feasible performance-wise. Thus the application needed a server side that is scalable and reliable. AWS was chosen as the cloud computing platform, with its microservices infrastructure fulfilling the goal of a secure and durable back-end system.
The two applications communicate with each other over a REST API.
The main flow of the server side of the application is presented in the following BPMN diagram:
This basic flow helped me define the functional requirements for the system, and since I had decided to work with AWS, I had to choose the services that would meet those requirements:
- repetitive, scheduled actions — CloudWatch Events + AWS Lambda
- storing the data from the exchange — Amazon S3
- training the AI models — AWS SageMaker
- saving predictions to the database — AWS DynamoDB
And as the main point of interaction is the mobile application, I also needed to create an API for mobile-to-backend communication; AWS API Gateway + AWS Lambda seemed like a good choice here.
Cloud application architecture
Based on the defined requirements, I’ve created a diagram describing the overall AWS architecture:
The diagram is pretty easy to read: going from the left side, where an AWS Lambda extracts data from the cryptocurrency exchange API, through a notebook instance that trains the AI models, to the database with saved predictions, which the mobile application reads through the AWS API Gateway and AWS Lambda combination.
Looking ahead, I must say that apart from being really flexible, such a serverless infrastructure is also really cheap. As long as you don’t extract the data every second (which my application obviously wasn’t doing) and the API Gateway load isn’t very high, the only thing you pay for is the SageMaker instance; the other services are covered by the AWS Free Tier.
Let’s look at the services one more time and specify their advantages and how each one is used in this application:
- AWS Lambda functions define the whole logic of the application. These functions are event-driven: every time they receive an event, they execute the code inside them. In our case, the Lambda that extracts data from the cryptocurrency exchange is triggered by a cron (time-based job scheduler) event; the other Lambdas are triggered by queue events.
- When the data is extracted from the cryptocurrency exchange, it is stored in Amazon S3, which is basically an object storage for different types of files. It offers high read and write speeds and can be used by various AWS services.
- SageMaker was used in the application as a machine-learning-oriented server instance running a Jupyter Notebook to build and train the models. SageMaker also supports Lifecycle Configurations, which were used to automatically prepare the desired environment and to launch and stop the notebook instances.
- AWS DynamoDB is a NoSQL document database service. Like AWS Lambda, it is serverless, so no servers or additional software are needed. AWS states that it can support 20 million requests per second and about 10 trillion requests per day.
- The AWS service called API Gateway offers the possibility to create a REST API that can be integrated with Lambda functions or other AWS services. In addition, the whole API can be defined from an OpenAPI .json file, which is useful when designing an API based on the OpenAPI specification.
- AWS SQS is a simple queuing service that allows easy communication between other services.
- CloudWatch Events rules help in defining a time-based scheduler. Rules can be written in the cron format known to UNIX users, or in the simpler rate(time) format.
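To make the extraction step concrete, here is a minimal sketch of what the cron-triggered Lambda could look like. The bucket name, exchange URL, and S3 key layout are my own illustrative assumptions, not the project’s actual values.

```python
import json
from datetime import datetime, timezone

BUCKET = "crypto-raw-data"  # hypothetical bucket name, not the project's real one


def build_s3_key(symbol: str, ts: datetime) -> str:
    """Partition raw exchange snapshots by symbol and date so that
    downstream jobs can read one day of data at a time."""
    return f"raw/{symbol}/{ts:%Y/%m/%d}/{ts:%H%M%S}.json"


def handler(event, context):
    """Entry point invoked by the CloudWatch Events cron rule."""
    import urllib.request

    import boto3

    ts = datetime.now(timezone.utc)
    # hypothetical exchange endpoint; a real one would come from config
    url = "https://api.exchange.example/ticker?symbol=BTCUSDT"
    with urllib.request.urlopen(url) as resp:
        payload = resp.read()

    # store the raw snapshot; S3 is the landing zone for the training job
    boto3.client("s3").put_object(
        Bucket=BUCKET,
        Key=build_s3_key("BTCUSDT", ts),
        Body=payload,
        ContentType="application/json",
    )
```

The point of the date-partitioned key scheme is that the SageMaker notebook can later list only the prefix for the day it needs instead of scanning the whole bucket.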
Mobile application architecture
The client-side application architecture is designed to retrieve data from the server easily and flexibly. The MVVM (Model-View-ViewModel) design pattern was chosen as the basis.
The following components were used to define a clean MVVM architecture:
- ViewModel class
- Data binding
The main component here is the ViewModel. Its main function is to store LiveData objects and update them when user requests come in. The LiveData is constantly observed asynchronously from the View layer, so that any changes can be displayed at any time.
Data Binding helps reduce boilerplate code. When the View detects changes in LiveData, it usually has to update views to present the new data to the user; this is done easily with the Data Binding Library.
As Android expects, all API calls should be performed asynchronously. The application achieves this by using asynchronous streams from the ReactiveX library. This way, calls are performed without blocking the UI, so users can keep using the application while waiting for a response.
The Retrofit library with its HTTP client helped define the API communication by creating interfaces backed by Kotlin data class models.
AI models
A total of four models were created: Linear Regression, XGBoost, LSTM, and NeuralProphet. A brief description of each model, with its advantages and disadvantages, is included below. Note: I’m not a specialist in ML/AI, so some of my statements and considerations could easily be wrong.
Some data preprocessing was done to improve model performance or to fulfill model requirements. All models except NeuralProphet used a set of price lags (delays) as input variables. The NeuralProphet model uses its own method of constructing future values, so no preprocessing was done there.
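As a sketch of what "a set of price lags as input variables" means in practice, the helper below turns a price series into a supervised-learning frame. The function name and column naming are my own; the project’s actual preprocessing code may differ.

```python
import pandas as pd


def add_price_lags(prices: pd.Series, n_lags: int) -> pd.DataFrame:
    """Build a frame where each row holds the n_lags previous prices
    as features (lag_1 .. lag_n) and the current price as target y."""
    frame = pd.DataFrame({"y": prices})
    for k in range(1, n_lags + 1):
        frame[f"lag_{k}"] = prices.shift(k)
    # the first n_lags rows have incomplete history, so drop them
    return frame.dropna()
```

For example, with two lags the row for day t contains the prices of days t-1 and t-2 as features and the price of day t as the target.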
LSTM required a lot of data preprocessing because of its architecture. The many-to-many architecture was used in the project, and the data had to be reshaped into the format [batch_size, timesteps, input_dim]. The NumPy library was used to convert the 2D data into the required 3D tensor. Additionally, the data were scaled to improve the performance of the model.
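A minimal sketch of that reshaping and scaling step could look as follows. I assume min-max scaling and a single input feature (the price), so input_dim is 1; the project’s actual scaler and window logic may differ.

```python
import numpy as np


def to_lstm_tensor(prices, timesteps):
    """Min-max scale a price series to [0, 1] and slice it into
    overlapping windows shaped [batch_size, timesteps, input_dim]."""
    prices = np.asarray(prices, dtype=float)
    lo, hi = prices.min(), prices.max()
    scaled = (prices - lo) / (hi - lo)  # min-max scaling

    # each window holds `timesteps` consecutive scaled prices;
    # the target is the price right after the window
    X = np.array([scaled[i:i + timesteps]
                  for i in range(len(scaled) - timesteps)])
    y = scaled[timesteps:]
    return X.reshape(-1, timesteps, 1), y  # input_dim = 1 (price only)
```

The resulting 3D tensor is what Keras expects as the input of an LSTM layer.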
Models advantages and disadvantages
— Linear regression is the simplest and one of the most popular models in ML. Its general case is called multiple linear regression, and it creates a model of the relationship between multiple input variables and a dependent output variable.
- Very fast, can handle very large samples.
- Easy to understand.
- Ineffective when it comes to very complex data.
— Decision-tree-based models can perform well in learning complex data relationships. Referencing a work comparing the performance of different tree-based models (Decision Tree, Bagging, Random Forest, AdaBoost, XGBoost) for stock market prediction, the XGBoost model was chosen as the one with the best performance and the most popular decision-tree-based algorithm in ML nowadays.
- Fast — does not require a lot of computational resources.
- Can handle complex structures of the data.
- No need to normalize the data.
- Can easily overfit when tuning hyperparameters.
— The problem of time-series regression can be solved by recurrent neural networks used in Deep Learning. These networks contain feedback loops and allow information to be stored, which is perfect when working with historical data. LSTM is a special type of recurrent neural network architecture capable of learning long-term dependencies. While the repeating module of a usual RNN contains a single layer, an LSTM’s contains four.
- Does not require a lot of features.
- Vast tuning possibilities.
- Like every neural network, requires a lot of computational resources.
— A model called NeuralProphet was used in the application as a mixed model: it combines the Facebook Prophet decomposable time-series model with the AR-Net neural network. NeuralProphet supports components like auto-regression, lagged regressors, and other tunable parameters. The main disadvantage of this model is that it is a quite new project with a small team involved in its development, so a few bugs were spotted, such as improper behavior for time-series data with a frequency other than one day.
Models implementation
- The Linear Regression model was implemented using the sklearn library. It does not require any parameters to be specified and can be fitted directly on pandas DataFrame objects using the fit method.
- The XGBoost model was implemented using the XGBRegressor class from the xgboost package. The model can also be trained on pandas DataFrames using the fit method with no parameters specified.
- The LSTM model was the hardest one to implement. The Keras library was chosen. A total of 4 hidden layers were created. Other model parameters included the hyperbolic tangent activation function, the Adam optimization algorithm, and the mean squared error loss function.
- The NeuralProphet model was implemented using the NeuralProphet library. It has a lot of parameters to tune, as it can be built on top of many neural network configurations. In our project, a model with 5 hidden layers and 30 lag units was implemented, and the ar_sparsity parameter was set to regularize the model.
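To show how little code the scikit-learn-style interface needs, here is a minimal sketch of fitting a lag-based Linear Regression. The toy price series and the choice of three lags are my own illustrative assumptions; the same fit(X, y) call also works for xgboost.XGBRegressor, which follows the scikit-learn API.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# toy, perfectly linear price series (illustrative data, not real quotes)
prices = np.arange(100, dtype=float) * 2.0 + 5.0

n_lags = 3
# each row: the three previous prices; target: the current price
X = np.array([prices[i:i + n_lags] for i in range(len(prices) - n_lags)])
y = prices[n_lags:]

model = LinearRegression().fit(X, y)

# one-step-ahead forecast from the latest three prices
next_price = model.predict(prices[-n_lags:].reshape(1, -1))[0]
```

On this perfectly linear toy series the regression recovers the trend and the forecast continues it; on real, noisy price data the fit is of course far less clean.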
It is also worth mentioning that Pulumi was used to deploy the whole cloud infrastructure of the application, following the Infrastructure as Code (IaC) principle. A total of 36 resources were defined.
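As a flavor of what IaC looks like with Pulumi’s Python SDK, here is a small configuration sketch of three of the resources described earlier. The resource names and settings are illustrative assumptions, not the project’s actual 36 resource definitions, and this fragment only runs inside a Pulumi program.

```python
import pulumi_aws as aws

# S3 bucket where the extraction Lambda lands raw exchange snapshots
data_bucket = aws.s3.Bucket("exchange-data")

# DynamoDB table holding the predictions served to the mobile app;
# a single string hash key and on-demand billing keep it in the Free Tier
predictions_table = aws.dynamodb.Table(
    "predictions",
    attributes=[aws.dynamodb.TableAttributeArgs(name="model", type="S")],
    hash_key="model",
    billing_mode="PAY_PER_REQUEST",
)

# CloudWatch Events rule firing the extraction Lambda on a schedule
extract_schedule = aws.cloudwatch.EventRule(
    "extract-schedule",
    schedule_expression="rate(1 hour)",
)
```

Running `pulumi up` diffs this declared state against the deployed stack and creates, updates, or deletes resources to match, which is what makes tearing the whole environment down and recreating it so cheap.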
The deployment process, as well as the unit tests and linters, was defined in GitLab CI/CD.
In this part of the article, I have described the architecture of all the parts of the application that was developed. As you can see, the amount of knowledge needed to build such an application is huge. I have not gone deep into the implementation details, as that could overcomplicate the whole thing.
In the next part, I’m going to show you the final results and we’ll see some metrics of models' performance.
Wanna talk about developing a cloud application for your innovative project? Contact me!