An open source project now under Apache’s guidance uses a template system for easy training and deployment of Spark-powered machine learning models

LAGMAN

The Apache Software Foundation has added a new machine learning project to its roster: Apache PredictionIO, an open source version of a project originally devised by a subsidiary of Salesforce.

What PredictionIO does for machine learning and Spark

Apache PredictionIO is built atop Spark and Hadoop, and serves Spark-powered predictions from data using customizable templates for common tasks. Apps send data to PredictionIO’s event server to train a model, then query the engine for predictions based on the model.
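
The send-then-query flow described above can be sketched with the Python standard library alone. The snippet only builds the two HTTP requests and sends nothing. The ports (7070 for the event server, 8000 for a deployed engine), the `/events.json` and `/queries.json` paths, the placeholder access key, and the `rate` event and query fields are assumptions based on PredictionIO's commonly documented defaults and a recommendation-style template; none of them come from this article, so check them against your own deployment.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed defaults; adjust these to match your own deployment.
EVENT_SERVER = "http://localhost:7070"
ENGINE_SERVER = "http://localhost:8000"
ACCESS_KEY = "YOUR_ACCESS_KEY"  # hypothetical placeholder, not a real key


def build_event_request(user_id, item_id, rating):
    """Build (but do not send) a POST that records one 'rate' event."""
    event = {
        "event": "rate",
        "entityType": "user",
        "entityId": user_id,
        "targetEntityType": "item",
        "targetEntityId": item_id,
        "properties": {"rating": rating},
    }
    url = f"{EVENT_SERVER}/events.json?{urlencode({'accessKey': ACCESS_KEY})}"
    return Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_query_request(user_id, num):
    """Build (but do not send) a POST asking the engine for `num` predictions."""
    query = {"user": user_id, "num": num}  # field names depend on the template
    return Request(
        f"{ENGINE_SERVER}/queries.json",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


event_request = build_event_request("u1", "i9", 4.0)
query_request = build_query_request("u1", 3)
```

Once an event server and a trained engine are running, either request could be passed to `urllib.request.urlopen` to send it.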

Spark, MLlib, HBase, Spray, and Elasticsearch all come bundled with PredictionIO, and Apache offers supported SDKs for working in Java, PHP, Python, and Ruby. Data can be stored in a variety of back ends: JDBC, Elasticsearch, HBase, HDFS, and local file systems are all supported out of the box. Back ends are pluggable, so a developer can create a custom back-end connector.

How PredictionIO templates make it easier to serve predictions from Spark

PredictionIO’s most notable advantage is its template system for creating machine learning engines. Templates reduce the heavy lifting needed to set up the system to serve specific kinds of predictions. They describe any third-party dependencies that might be needed for the job, such as the Apache Mahout machine-learning app framework.

Some existing templates include:

A universal recommendation engine.
Text classification.
Survival analysis (for time-between-failure predictions).
Labeling topics using Wikipedia as a knowledge base.
Similarity analysis.

Some templates also integrate with other machine learning products. For example, two of the prediction templates currently in PredictionIO’s gallery, for churn rate detection and general recommendations, use H2O.ai’s Sparkling Water enhancements for Spark.

PredictionIO can also automatically evaluate a prediction engine to determine the best hyperparameters to use with it. The developer must choose and configure the metrics used for the evaluation, but that generally takes less work than tuning hyperparameters by hand.
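
The general idea behind metric-driven hyperparameter selection can be illustrated with a plain grid search. This is a generic sketch, not PredictionIO's evaluation API: `grid_search`, `toy_metric`, and the `rank` and `reg` parameter names are all hypothetical, and the toy metric stands in for "train the engine, then score it on held-out data."

```python
from itertools import product


def grid_search(train_and_score, param_grid):
    """Try every combination in param_grid; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = train_and_score(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_metric(params):
    """Hypothetical stand-in for a real evaluation; peaks at rank=10, reg=0.1."""
    return -((params["rank"] - 10) ** 2) - ((params["reg"] - 0.1) ** 2)


best_params, best_score = grid_search(
    toy_metric,
    {"rank": [5, 10, 20], "reg": [0.01, 0.1, 1.0]},
)
```

The developer's part of the work is the metric function and the candidate values; the search loop itself is mechanical, which is why automating it saves effort.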

When running as a service, PredictionIO can accept prediction queries singly or as a batch. Batched predictions are automatically parallelized across a Spark cluster, as long as the algorithms used in a batch prediction job are all serializable. (PredictionIO’s default algorithms are.)
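
A batch of queries is typically prepared as a file rather than sent one request at a time. The sketch below assumes the batch job reads one JSON query object per line, which is how the input to the `pio batchpredict` command is generally described; the `to_batch_input` helper and the `user` and `num` query fields are hypothetical and depend on the template in use.

```python
import json


def to_batch_input(queries):
    """Serialize queries as one compact JSON object per line."""
    return "\n".join(json.dumps(q, sort_keys=True) for q in queries) + "\n"


# Three hypothetical recommendation queries, one per user.
queries = [{"user": f"u{i}", "num": 3} for i in range(1, 4)]
batch_text = to_batch_input(queries)
```

Writing `batch_text` to a file gives an input in which each line can be processed independently, which is what lets a cluster split the work across workers.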

Where to download PredictionIO

PredictionIO’s source code is available on GitHub. For convenience, various Docker images are available, as well as a Heroku build pack.
